Abstract. For $1 < p < \infty$ and for a weight $w$ in $A_p$, we show that the $r$-variation of the Fourier sums of any function in $L^p(w)$ is finite a.e. for $r$ larger than a finite constant depending on $w$ and $p$. The fact that the variation exponent depends on $w$ is necessary. This strengthens previous work of Hunt-Young and is a weighted extension of a variational Carleson theorem of Oberlin-Seeger-Tao-Thiele-Wright. The proof uses a weighted adaptation of phase plane analysis and a weighted extension of a variational inequality of Lépingle.



1. Introduction 

For a measurable function $f$ on $[0,1]$, let $Sf$ denote the maximal Fourier sum:
\[ Sf(x) := \sup_{n} |(S_n f)(x)| , \qquad S_n f(x) := \sum_{|k|<n} \hat f(k) e^{i2\pi kx} . \]
Here $\hat f(k) = \int_0^1 f(x) e^{-i2\pi kx}\,dx$ is the $k$th Fourier coefficient, and by convention $S_n f = 0$ for $n \le 0$. (Here we use the strict inequality $|k| < n$ in the definition of $S_n$ for the convenience of the transference argument in Section 1.2.)
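
As an illustration of the convention above, the following minimal sketch evaluates $S_n f$ and a truncated version of $Sf$ for a trigonometric polynomial. The representation of $f$ by a dictionary of Fourier coefficients and the function names are assumptions made for this example only.

```python
import cmath

def partial_sum(coeffs, n, x):
    """S_n f(x): sum of fhat(k) e^{i 2 pi k x} over |k| < n (empty, hence 0, if n <= 0).

    coeffs is a hypothetical representation: a dict mapping k to fhat(k).
    """
    return sum(c * cmath.exp(2j * cmath.pi * k * x)
               for k, c in coeffs.items() if abs(k) < n)

def maximal_sum(coeffs, x, n_max):
    """Truncated maximal Fourier sum: max of |S_n f(x)| over 0 <= n <= n_max."""
    return max(abs(partial_sum(coeffs, n, x)) for n in range(n_max + 1))

# f(x) = 1 + cos(2 pi x) has coefficients fhat(0) = 1, fhat(1) = fhat(-1) = 1/2.
f_coeffs = {0: 1.0, 1: 0.5, -1: 0.5}
```

With the strict inequality, `partial_sum(f_coeffs, 1, x)` only sees the constant term, and the full polynomial is recovered from $n = 2$ on.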

By the Carleson-Hunt theorem [2,7], $S$ is bounded on $L^p$ for $1 < p < \infty$, which leads to a.e. convergence of the Fourier series of functions in $L^p$. See also Sjölin [22] for the Walsh case, and [5, 12] for alternative proofs. More quantitative information about the convergence rate of Fourier series has been obtained by Oberlin-Seeger-Tao-Thiele-Wright [19], via bounds on a strengthening of $S$. To formulate this strengthening of $S$, we first recall the $r$-variation norm of a sequence $(a_n)_{n\in\mathbb{Z}}$. If $0 < r < \infty$ then
\[ \|(a_n)\|_{V^r} := \sup_{M,\, N_0<\dots<N_M} \Big( |a_{N_0}|^r + \sum_{j=1}^{M} |a_{N_j} - a_{N_{j-1}}|^r \Big)^{1/r} , \]
and for $r = \infty$ we have $\|(a_n)\|_{V^\infty} = \sup_n |a_n|$. It is clear that if $\|(a_n)\|_{V^r}$ is finite for some $r < \infty$ then $(a_n)$ is a Cauchy sequence and therefore is convergent; the finiteness of $\|(a_n)\|_{V^r}$ may be considered as a quantitative measurement of the convergence rate of $(a_n)$. The variational strengthening of $S$ considered in [19] is the following operator

\[ S_{[r]} f(x) = \sup_{M,\, N_0<\dots<N_M} \Big( \sum_{j=1}^{M} |S_{N_j} f(x) - S_{N_{j-1}} f(x)|^r \Big)^{1/r} \tag{1.1} \]
and it was shown in [19] that, for $1 < p < \infty$, $S_{[r]}$ is bounded in $L^p([0,1])$ if $r > \max(2,p')$.
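
For a finite sequence the $r$-variation norm can be computed exactly by dynamic programming over the last index of the chain $N_0 < \dots < N_M$. The sketch below is only an illustration for finite lists; it assumes the norm includes the initial term $|a_{N_0}|^r$, and the function name is hypothetical.

```python
import math

def variation_norm(a, r):
    """r-variation norm of a finite sequence a, including the initial term |a_{N_0}|^r."""
    if not a:
        return 0.0
    if math.isinf(r):
        return max(abs(v) for v in a)
    best = []  # best[i]: largest r-th power sum over chains N_0 < ... < N_M ending at i
    for i, v in enumerate(a):
        start_here = abs(v) ** r
        extend = max((best[j] + abs(v - a[j]) ** r for j in range(i)), default=0.0)
        best.append(max(start_here, extend))
    return max(best) ** (1.0 / r)
```

On the oscillating sequence `[0, 1, 0, 1]` this gives 3 for $r = 1$, $\sqrt{3}$ for $r = 2$ and 1 for $r = \infty$, illustrating that the norm decreases as $r$ increases.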

Convergence of Fourier series in non-Lebesgue settings was also considered by Hunt-Young [8], where it was shown that $S$ is bounded on $L^p(w)$ for any $A_p$ weight $w$, $1 < p < \infty$. See also [6] for extensions to more general settings. Recall that a positive a.e. weight $w$ is in $A_p$ if uniformly over intervals $I$ we have
\[ \Big( \frac{1}{|I|}\int_I w\,dx \Big) \Big( \frac{1}{|I|}\int_I w^{-1/(p-1)}\,dx \Big)^{p-1} \le [w]_{A_p} < \infty . \]

Our aim in this paper is to strengthen the results of [8] and [19] by considering weighted estimates for $S_{[r]}$.

Theorem 1.1. Let $1 < p < \infty$ and $w \in A_p$. Then there is an $R = R(p, [w]_{A_p}) < \infty$ such that for all $r \in (R, \infty]$ we have
\[ \|S_{[r]} f\|_{L^p([0,1],w)} \le C \|f\|_{L^p([0,1],w)} \tag{1.2} \]
for some constant $C$ depending only on $w$, $p$, $r$.

As remarked above, Theorem 1.1 gives more quantitative information about the 
convergence of Fourier series than [8] (which corresponds to the endpoint r = oo). 
Theorem 1.1 follows from 

Theorem 1.2. Let $1 < p < \infty$ and $w \in A_q$ for some $q \in [1,p)$. Then for $r > \max(2q, \frac{pq}{p-q})$ it holds that
\[ \|S_{[r]} f\|_{L^p([0,1],w)} \le C \|f\|_{L^p([0,1],w)} \tag{1.3} \]
for some constant $C$ depending only on $w$, $p$, $q$, $r$.

We derive Theorem 1.1 from Theorem 1.2. Let $1 < p < \infty$ and $w \in A_p$. Since the $A_p$ condition is an open condition, we have $w \in A_q$ for some $1 < q < p$ (see e.g. [16]). Then (1.2) follows from applying Theorem 1.2.

We would like to point out that, in the conclusion of Theorem 1.1, the variation exponent must depend upon $w \in A_p$. Indeed, suppose towards a contradiction that there is some $p \in (1,\infty)$ such that (1.2) holds for every $w \in A_p$ and for a fixed $r \in (0,\infty)$. Using the fact that the variation norm decreases as $r$ increases, we may assume that $r > 1$. Then $S_{[r]}$ is sublinear, and an application of the Rubio de Francia extrapolation theorem shows that the same inequality (with the same $r$) would have to hold for $w$ being the Lebesgue measure and all $p \in (1,\infty)$, contradicting an example in [19, Section 2]. We also remark that in the Lebesgue setting when $w = 1 \in A_1$ the range of $r$ in Theorem 1.2 is sharp.
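
The range of exponents in Theorem 1.2 can be checked numerically. The sketch below assumes the threshold reads $\max(2q, pq/(p-q))$, which is the reading consistent with the condition $1/p < 1/q - 1/r$ used in Theorem 2.1; the function names are hypothetical.

```python
def variation_threshold(p, q):
    """Lower bound for r, assuming the threshold is max(2q, pq/(p-q)) with 1 <= q < p."""
    if not (1 <= q < p):
        raise ValueError("need 1 <= q < p")
    return max(2 * q, p * q / (p - q))

def is_admissible(p, q, r):
    """Equivalent form of the same condition: r > 2q and 1/p < 1/q - 1/r."""
    return r > 2 * q and 1 / p < 1 / q - 1 / r
```

In the Lebesgue case $q = 1$ the threshold reduces to $\max(2, p')$, matching the range from [19].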

Our proof of Theorem 1.2 extends our previous work in [3] on a Walsh-Fourier model of $S_{[r]}$ and at the same time is a weighted extension of [19]. The proof uses two new ingredients: weighted analysis on the Fourier phase plane, and a weighted extension of a classical variational inequality of Lépingle (Lemma 5.2). The weighted adaptation of analysis on the Fourier phase plane in our proof follows closely the adaptation in [3], modulo (substantial) technicalities arising from the lack of perfect localization of Fourier wave packets. In particular, our approach is different from the elegant argument in [8], where a good-$\lambda$ argument was used to deduce weighted bounds for $S$ from the Carleson-Hunt theorem. It is not hard to see that a naive adaptation of the good-$\lambda$ approach in [8] does not apply to the variation-norm Carleson operator. We anticipate that the weighted phase plane analysis in our proof will be useful in a variety of open problems involving weighted bounds for multilinear operators with oscillatory nature, where a naive adaptation of the approach in [8] seems not applicable^1. Our approach is inspired by an argument of Rubio de Francia [21], though it is easier to see this inspiration in the dyadic setting of [3].

1.1. Notational convention. (i) Henceforth, we work on the real line $\mathbb{R}$, and set
\[ \hat f(\xi) = \int f(x) e^{-i2\pi x\xi}\,dx . \]

(ii) For any $1 \le t < \infty$ we will denote by $M_t f$ the $L^t$ Hardy-Littlewood maximal function, and by $M_{t,w} f$ the weighted $L^t$ maximal function
\[ M_{t,w} f(x) = \sup_{I:\, x\in I} \Big( \frac{1}{w(I)} \int_I |f(y)|^t w(y)\,dy \Big)^{1/t} . \]

(iii) The dyadic intervals $\mathcal{D}$ will play a distinguished role. We denote by $f^\sharp$ the dyadic sharp maximal function of $f$, namely
\[ f^\sharp(x) := \sup_{I\in\mathcal{D}} 1_I(x)\, |I|^{-1} \int_I \Big| f(z) - |I|^{-1}\int_I f(y)\,dy \Big|\,dz . \]
All BMO norms, unless otherwise specified, are dyadic BMO norms, namely $\|f\|_{BMO} = \|f^\sharp\|_\infty$. An important inequality for this paper is the familiar estimate
\[ \|f\|_{L^p(w)} \lesssim \|f^\sharp\|_{L^p(w)} , \qquad w \in A_p . \tag{1.4} \]

(iv) For any interval $I$ and $c > 0$ we denote by $cI$ the interval with length $c|I|$ and with the same center as $I$. This should not be confused with $c(I)$, which will denote the center of $I$. A standard property of a weight $w \in A_p$ is that it is doubling: there exists $\gamma = \gamma(w)$ such that for any interval $I$ and any $k \ge 0$ it holds that
\[ w(2^k I) \le 2^{\gamma k} w(I) . \tag{1.5} \]

(v) For any set $G$ we denote $w(G) = \int_G w(x)\,dx$.
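
The distinction between the dilate $cI$ and the center $c(I)$ can be made concrete with a small sketch. Intervals are represented here as pairs of exact rationals, which is a choice made for this illustration only; the helper names are hypothetical.

```python
from fractions import Fraction

def center(interval):
    """c(I): the center of the interval I = (left, right)."""
    left, right = interval
    return (left + right) / 2

def dilate(interval, c):
    """cI: the interval with the same center as I and length c|I|."""
    left, right = interval
    half = c * (right - left) / 2
    mid = center(interval)
    return (mid - half, mid + half)

def contains(outer, inner):
    """True if inner is a subset of outer (comparing endpoints)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]
```

For instance, with $I_1 = (1/4, 1/2) \subset I_2 = (0, 1)$ one can check that $2^k I_1 \subset 2^k I_2$ for $k \ge 0$; this monotonicity of dilation for nested intervals is used in Section 3.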

1.2. Transference to a singular integral form. Using a weighted variant of a transference argument in [19, Appendix A], it is not hard to see that Theorem 1.2 follows from Theorem 1.3 stated below. In Theorem 1.3, we define
\[ C_{[r]} f(x) := \sup_{K,\, N_0<\dots<N_K} \Big( \sum_{j=1}^{K} \Big| \int_{N_{j-1}}^{N_j} \hat f(\xi) e^{i2\pi x\xi}\,d\xi \Big|^r \Big)^{1/r} . \tag{1.6} \]

Theorem 1.3. Let $1 < p < \infty$ and $w \in A_q$ for some $q \in [1,p)$. Then for $r > \max(2q, \frac{pq}{p-q})$ it holds that
\[ \|C_{[r]} f\|_{L^p(\mathbb{R},w)} \le C \|f\|_{L^p(\mathbb{R},w)} \tag{1.7} \]
for some constant $C$ depending only on $w$, $p$, $q$, $r$.

For the reader's convenience, we include details of the transference argument. 

For any $K \ge 1$ and $m \ge 1$, let $\mathcal{I}_{m,K}$ be the set of all non-decreasing sequences of length $K+1$ in $\{0, \dots, m\}$. For each such sequence $\vec N = (N_0 \le \dots \le N_K)$ we construct the variation sum
\[ S_{\vec N} f = \Big( \sum_{j=1}^{K} |S_{N_j} f - S_{N_{j-1}} f|^r \Big)^{1/r} . \tag{1.8} \]

^1 We would like to point out that Xiaochun Li [15] has some unpublished results about weighted estimates for the bilinear Hilbert transform.
Since the set $\mathcal{I}_{m,K}$ is bigger when $m$ or $K$ is larger, by two applications of the monotone convergence theorem it suffices to show that
\[ \Big\| \sup_{\vec N \in \mathcal{I}_{m,K}} S_{\vec N} f \Big\|_{L^p([0,1],w)} \le C \|f\|_{L^p([0,1],w)} , \]
where the implicit constant is uniform over $m$ and $K$. Let $\sigma = w^{1-p'}$. Then the above inequality has the following equivalent dual form: for $f$ defined on $[0,1]$ and for $g$ defined on $[0,1] \times \mathcal{I}_{m,K} \times \{1, \dots, K\}$ (we will write $g_{\vec N,j}(x)$ to denote $g(x, \vec N, j)$),
\[ \Big| \int_0^1 f(x) \sum_{\vec N \in \mathcal{I}_{m,K}} \sum_{j=1}^{K} [(S_{N_j} - S_{N_{j-1}}) g_{\vec N,j}](x)\,dx \Big| \le C \|f\|_{L^p([0,1],w)} \Big\| \sum_{\vec N \in \mathcal{I}_{m,K}} \Big( \sum_{j=1}^{K} |g_{\vec N,j}|^{r'} \Big)^{1/r'} \Big\|_{L^{p'}([0,1],\sigma)} . \tag{1.9} \]

To prove (1.9), we may assume without loss of generality that $f$ and $g_{\vec N,j}$ are trigonometric polynomials for any $\vec N \in \mathcal{I}_{m,K}$ and $1 \le j \le K$.

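For any $N \ge 0$ let $C_N$ be the Fourier multiplier operator on $L^2(\mathbb{R})$ whose symbol is the characteristic function of $\{-(N-1/3) < \xi < N-1/3\}$ (by definition $C_N = 0$ if $N < 1/3$). Let $\delta(x) = e^{-\pi x^2}$ and $\delta_M(x) = \delta(x/M)$.

By standard transference theory (see e.g. [23, page 261]), for any integer $N$ and any 1-periodic trigonometric polynomials $P$, $Q$ we have
\[ \int_0^1 P(x) S_N Q(x)\,dx = \lim_{M\to\infty} \frac{1}{M} \int_{\mathbb{R}} P(x) \delta_{M/\alpha}(x)\, C_N(\delta_{M/\beta} Q)(x)\,dx , \]
for any $\alpha, \beta \in (0,1)$ such that $\alpha^2 + \beta^2 = 1$. We take $\alpha = \beta = 1/\sqrt{2}$. It follows that the left hand side of (1.9) is the same as
\[ \lim_{M\to\infty} \frac{1}{M} \Big| \int_{\mathbb{R}} f(x) \delta_{M/\alpha}(x) \sum_{\vec N \in \mathcal{I}_{m,K}} \sum_{j=1}^{K} [(C_{N_j} - C_{N_{j-1}})(\delta_{M/\beta}\, g_{\vec N,j})](x)\,dx \Big| . \]

It follows from Theorem 1.3 that the analogue of (1.9) for the $C_N$'s holds, thus the above limit is bounded above by
\[ C \limsup_{M\to\infty} \frac{1}{M} \|f \delta_{M/\alpha}\|_{L^p(\mathbb{R},w)} \Big\| \delta_{M/\beta} \sum_{\vec N \in \mathcal{I}_{m,K}} \Big( \sum_{j=1}^{K} |g_{\vec N,j}|^{r'} \Big)^{1/r'} \Big\|_{L^{p'}(\mathbb{R},\sigma)} . \tag{1.10} \]

Since $w \in A_q \subset A_p$, we have $\sigma = w^{1-p'} \in A_{p'}$ and in particular both $w$ and $\sigma$ are doubling weights. On the other hand, it follows from the exponential decay of $\delta$ that for any doubling measure $\mu$, any $1 \le q < \infty$ and any 1-periodic function $h$
\[ \sup_{M \ge 1} M^{-1/q} \|\delta_M h\|_{L^q(\mathbb{R},\mu)} \le C \|h\|_{L^q([0,1],\mu)} . \]

Using this observation, (1.9) follows immediately from (1.10).
We take up the proof of Theorem 1.3 below.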
2. Discretization

In this section we reduce the task of proving (1.7) to proving similar bounds on model operators. Consider absolute constants $C_2 \in [1,\infty)$ and $C_3 \in (0, C_2)$ and $C_{2,1}$, $C_{2,2}$, $C_1$ in $[C_2, \infty)$. Constants with these properties are called admissible.

2.1. Tiles and bitiles. In this paper, a tile is a dyadic rectangle of area 1, which we will write $p = I_p \times \omega_p$ and refer to $I_p$ as the spatial interval and $\omega_p$ as the frequency interval of $p$. By a bitile $P$ we mean a rectangle $I_P \times \omega_P$ that contains (as subsets) two tiles $P_1$ and $P_2$ such that they share the same (dyadic) spatial interval $I_P$ and
\[ \sup C_2\omega_{P_1} < \inf C_2\omega_{P_2} , \qquad |\omega_P| \le C_1(|\omega_{P_1}| + |\omega_{P_2}|) , \]
\[ \omega_P = \text{convex hull}(C_{2,1}\omega_{P_1} \cup C_{2,2}\omega_{P_2}) . \]
The classical setting (see e.g. [12]) when a bitile is a dyadic rectangle of area 2 is the special case of our general setting when $C_2 = C_{2,1} = C_{2,2} = C_1 = 1$.

We say that two bitiles $P$ and $P'$ are disjoint if they are disjoint in the phase plane. Denote by $\widetilde\omega_P$ the convex hull of $C_2\omega_{P_1} \cup C_2\omega_{P_2}$; clearly $\widetilde\omega_P \subset \omega_P$. In this paper, whenever we talk about a bitile collection it shall be assumed that the implicit constants above are the same for any two bitiles.

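2.2. Fourier wave packets. For every tile $p = I_p \times \omega_p$, a function $\phi_p$ is called a Fourier packet adapted to $p$ if $\mathrm{supp}(\hat\phi_p) \subset C_3\omega_p$, and furthermore for any $N \ge 0$ and $n \ge 0$ it holds (for some $C_{N,n}$ depending only on $N$ and $n$) that
\[ \Big| \frac{d^n}{dx^n} \big[ e^{-i2\pi c(\omega_p)x} \phi_p(x) \big] \Big| \le C_{N,n} |I_p|^{-1/2-n} \Big( 1 + \frac{|x - c(I_p)|}{|I_p|} \Big)^{-N} ; \tag{2.1} \]
here recall that $c(I_p)$ denotes the center of $I_p$. In a family of Fourier packets, we will assume that the involved implicit constants are uniform.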

2.3. Discretization and the model operators. For any $r \in [1,\infty)$ and any finite collection $\mathbf{P}$ of bitiles, let
\[ C_{r,\mathbf{P}} f := \sup_{K,\, N_0<\dots<N_K} \Big( \sum_{k=1}^{K} \Big| \sum_{P\in\mathbf{P}} \langle f, \phi_{P_1}\rangle \phi_{P_1}(x)\, 1_{\{N_{k-1}\notin\omega_P,\ N_k\in\omega_{P_2}\}} \Big|^r \Big)^{1/r} . \]

A symmetric variant of $C_{r,\mathbf{P}}$ can be obtained by changing the limiting condition involving $N_j$, $N_{j-1}$ in the above definition to $\{N_{j-1} \in \omega_{P_1},\ N_j \notin \omega_P\}$.

Without loss of generality, we assume in the rest of the paper that $2q < r < \infty$ and $q \in (1,\infty)$. Via a discretization argument in [19], which we summarize below, Theorem 1.3 follows from the theorem below and its symmetric variant (whose proof is completely analogous).

Theorem 2.1. There is a constant $C < \infty$ independent of $f$ and $\mathbf{P}$ such that
\[ \|C_{r,\mathbf{P}} f\|_{L^p(w)} \le C \|f\|_{L^p(w)} \tag{2.2} \]
for any finite collection $\mathbf{P}$ of bitiles and any $p \in (q,\infty)$ such that $1/p < 1/q - 1/r$.

Discretization. We sketch the main ideas of our weighted adaptation of the discretization argument in [19, Section 3]. For each interval $(a,b)$ with non-dyadic endpoints, let $\mathcal{J}$ be the collection of maximal dyadic intervals $J$ in $(a,b)$ such that $\mathrm{dist}(J,a), \mathrm{dist}(J,b) \ge |J|$. It is not hard to see that $\mathcal{J}$ partitions $(a,b)$, and the ratio between the lengths of two adjacent elements of $\mathcal{J}$ is at most 2. By direct examination, it follows that there are $O(1)$ possible mutually exclusive scenarios involving relative locations of $J$ inside $(a,b)$, and these scenarios are characterized by the following information:

• whether $J$ is the left or right child of its dyadic parent,

• the distance from $a$ to $J$, which could be arbitrarily large,

• the distance from $b$ to $J$, which could be arbitrarily large.

More specifically, we may divide $\mathcal{J}$ into $O(1)$ disjoint subsets of the following type: if $m$, $n$, $k$ are bounded positive integers and side is left or right then we denote by $\mathcal{J}_{k,m,n,\mathrm{side}}$ the set of all dyadic intervals $J$ such that $J$ is the side-child of its dyadic parent, and $a \in J_{low}(k,m)$ and $b \in J_{high}(k,n)$.

• If $k = 1$ then $J_{low} = J - (m+1)|J|$ and $J_{high} = J + (n+1)|J|$.

• If $k = 2$ then $J_{low} = J - (m+1)|J|$ and $J_{high} = [\sup J + n|J|, \infty)$.

• If $k = 3$ then $J_{low} = (-\infty, \inf J - m|J|]$ and $J_{high} = J + (n+1)|J|$.

The following example of such a partition was given in [19]; we include this example for the convenience of the reader. Below are the values of $(k, m, n, \mathrm{side})$:

{(1, 2, 1, left), (1, 2, 2, left), (1, 3, 1, left), (1, 3, 2, left), (2, 1, 1, left),
(2, 1, 1, right), (2, 2, 1, right), (3, 4, 1, left), (3, 3, 1, right), (3, 4, 2, left)} .

Since the ratio between the lengths of adjacent intervals in $\mathcal{J}$ is bounded by 2, we may construct nonnegative $L^\infty$-normalized bump functions $\psi_J$ such that $1_{(a,b)}(\xi) = \sum_{J\in\mathcal{J}} \psi_J(\xi)$; furthermore $\psi_J$ is supported inside a $(1+c)$ dilation of $J$ for each $J \in \mathcal{J}$, where the absolute constant $c > 0$ can be taken arbitrarily small. By using a standard Fourier sampling theorem for the Schwartz band-limited function $\mathcal{F}^{-1}(\hat f(\xi)\sqrt{\psi_J(\xi)})$ (cf. [25]) we can easily decompose
\[ \hat f(\xi)\psi_J(\xi) = \sum_{I:\, |I| = 1/(2^L|J|)} \langle f, \phi_{I\times J}\rangle\, \hat\phi_{I\times J}(\xi) \]
for some positive integer $L = O(1)$, where $\hat\phi_{I\times J}(\xi) := |I|^{1/2}\sqrt{\psi_J(\xi)}\, e^{-2\pi i c(I)\xi}$. Note that the frequency support of $\phi_{I\times J}$ is inside a $(1+c)$ dilation of $J$, where $c > 0$ can be chosen small. Furthermore, it is clear that the collection of functions $\{\phi_{I\times J} : |I| = 2^{-L}|J|^{-1}\}$ can be decomposed^2 into $O(1)$ families of Fourier wave packets adapted to the tiles in the phase plane.

^2 This decomposition ensures that there is only one wave packet associated with each dyadic rectangle of area 1.

Let $\mathbf{P}_{\mathrm{side}}$ denote the collection of all dyadic rectangles of area $2^{-L}$ whose frequency interval is the side-child of its parent. Then
\[ \int_a^b \hat f(\xi) e^{i2\pi x\xi}\,d\xi = \sum_{k=1}^{3} \sum_{m,n,\mathrm{side}} \sum_{p\in\mathbf{P}_{\mathrm{side}}} \langle f, \phi_p\rangle \phi_p(x)\, 1_{\{a\in L_p(k,m),\ b\in U_p(k,n)\}} , \]
here the intervals $L_p(k,m)$ and $U_p(k,n)$ are the $J_{low}$ and $J_{high}$ of $J = \omega_p$.

Now, under the assumption that $f$ is Schwartz, it is no loss of generality to assume that the sequences $(N_0 < \dots < N_K)$ (used in the definition of $C_{[r]}$) do not contain endpoints of dyadic intervals. Performing the above partition on every $(N_{j-1}, N_j)$, it then follows from the triangle inequality that

WEIGHTED BOUNDS FOR VARIATIONAL FOURIER SERIES 



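\[ C_{[r]} f(x) \le \sum_{k,m,n,\mathrm{side}} C_{k,m,n,\mathrm{side}} f(x) , \]
where
\[ C_{k,m,n,\mathrm{side}} f(x) := \sup_{K,\, N_0<\dots<N_K} \Big( \sum_{j=1}^{K} \Big| \sum_{p\in\mathbf{P}_{\mathrm{side}}} \langle f, \phi_p\rangle \phi_p(x)\, 1_{\{N_{j-1}\in L_p(k,m),\ N_j\in U_p(k,n)\}} \Big|^r \Big)^{1/r} . \]

It is not hard to see that for each $1 \le m, n = O(1)$, we can bound $C_{3,m,n,\mathrm{side}} f(x)$ by a sum of $O(1)$ operators of the same nature as $C_{r,\mathbf{P}}$, with an appropriate choice of admissible constants $C_1$, $C_2$, $C_{2,1}$, $C_{2,2}$ and $C_3$. Similarly, $C_{2,m,n,\mathrm{side}} f(x)$ can be bounded by a symmetric variant of $C_{r,\mathbf{P}}$. Since any interval $[a,b)$ can be written as $(-\infty,b) \setminus (-\infty,a)$, it is not hard to see that $C_{1,m,n,\mathrm{side}} f(x)$ can be controlled by two operators of the same nature as $C_{3,m,n,\mathrm{side}} f(x)$. Thus, Theorem 1.3 follows from Theorem 2.1. This completes the discretization step. □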
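
The partition of $(a,b)$ into maximal dyadic intervals $J$ with $\mathrm{dist}(J,a), \mathrm{dist}(J,b) \ge |J|$ used in the discretization step can be explored numerically. The sketch below is an illustration only: it uses exact rational arithmetic, truncates the (infinite) family at a finest scale `s_max`, and relies on the observation that a qualifying interval is maximal exactly when its dyadic parent does not qualify. The function names and the scale parameters are assumptions of this example.

```python
import math
from fractions import Fraction

def qualifies(k, s, a, b):
    """J = [k 2^-s, (k+1) 2^-s) lies in (a, b) with dist(J, a), dist(J, b) >= |J|."""
    length = Fraction(2) ** (-s)
    left, right = k * length, (k + 1) * length
    return left - a >= length and b - right >= length

def whitney_intervals(a, b, s_min, s_max):
    """Maximal qualifying dyadic intervals of length 2^-s, s_min <= s <= s_max.

    s_min must be coarse enough that no interval of length 2^-(s_min - 1) qualifies.
    """
    selected = []
    for s in range(s_min, s_max + 1):
        length = Fraction(2) ** (-s)
        for k in range(math.ceil(a / length), math.floor(b / length)):
            # maximal <=> J qualifies but its dyadic parent (k // 2 at scale s - 1) does not
            if qualifies(k, s, a, b) and not qualifies(k // 2, s - 1, a, b):
                selected.append((k * length, (k + 1) * length))
    return sorted(selected)
```

For $(a,b) = (1/3, 5/7)$ the largest selected intervals have length $1/16$, and the lengths of neighbouring intervals differ by a factor of at most 2, in line with the claim in the text.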

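Below we set up a linearized variant of $C_{r,\mathbf{P}}$. By duality in $\ell^r$, to show (2.2) it suffices to consider the following operator (we omit the dependence on $r$ for simplicity):
\[ (C_{\mathbf{P}} f)(x) = \sum_{j=1}^{K(x)} \sum_{P\in\mathbf{P}} \langle f, \phi_{P_1}\rangle \phi_{P_1}(x)\, 1_{\{N_{j-1}(x)\notin\omega_P,\ N_j(x)\in\omega_{P_2}\}}\, d_j(x) , \]
here $K : \mathbb{R} \to \mathbb{Z}_+$, $N_0(x) < \dots < N_K(x)$ and $\{d_j\}$ are measurable functions, with
\[ |d_1(x)|^{r'} + \dots + |d_{K(x)}(x)|^{r'} = 1 . \]

For each bitile $P$, let $d_P(x)$ be $0$ unless there exists a (clearly unique) $j$ such that $N_{j-1}(x) \notin \omega_P$ and $N_j(x) \in \omega_{P_2}$, in which case we set $d_P(x) = d_j(x)$. For a function $g$, we note that $\langle C_{\mathbf{P}} f, g w\rangle = B_{\mathbf{P}}(f,g)$, where
\[ B_{\mathbf{P}}(f,g) := \sum_{P\in\mathbf{P}} \langle f, \phi_{P_1}\rangle \langle \phi_{P_1} d_P, g w\rangle . \]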
Pep 

We say that $G' \subset G$ is a major subset if $w(G') \ge w(G)/2$ and we say $G'$ has full measure if $w(G') = w(G)$. Via a standard restricted weak-type interpolation argument, Theorem 2.1 follows from the following proposition:

Proposition 2.1. Let $F$, $G$ be such that $w(F), w(G) < \infty$. Then there are major subsets of $F$ and $G$, denoted respectively by $F'$ and $G'$, such that:

(i) at least one subset has full measure, and

(ii) for any $|f| \le 1_{F'}$ and $|g| \le 1_{G'}$ and any finite collection of bitiles $\mathbf{P}$ we have
\[ B_{\mathbf{P}}(f,g) \le C\, w(F)^{1/p} w(G)^{1-1/p} \tag{2.3} \]
for all $p \in (q,\infty)$ such that $1/p < 1/q - 1/r$.

In the rest of the paper, we will prove Proposition 2.1. 

3. Decomposition of bitile collections 

Without loss of generality we may assume the following separation conditions:

(S1) The ratio $\mathrm{dist}(\omega_{P_1}, \omega_{P_2})/|\omega_{P_1}|$ is constant over $P \in \mathbf{P}$.

(S2) For any two bitiles $P$ and $P'$, if $\omega_P \cap \omega_{P'} \ne \emptyset$ and $|I_P| = |I_{P'}|$ then $\omega_P = \omega_{P'}$.

(S3) For any two bitiles $P$ and $P'$, if $|I_P| > |I_{P'}|$ then $|\omega_P| \le |\omega_{P'}|/K_0$ for some large absolute constant $K_0$ that will be chosen in the proof. (The choice of $K_0$ is refined a bounded number of times below.)

Remark 3.1. First, we will require that Kq > q^Lq ■ This means that for any 
1 < i < 2, if C 3 ujp i n C 3 ujp> ^ and \I P \ > \I P /\ then uj p C C 2 wp' 
3.1. Trees. In this paper, a finite collection $T$ of bitiles is a tree if there exists a dyadic interval $I_T$ and a real number $\xi_T$ such that for any $P \in T$ we have
\[ I_P \subset I_T \quad\text{and}\quad \omega_T := \Big[ \xi_T - \frac{1}{2|I_T|},\ \xi_T + \frac{1}{2|I_T|} \Big) \subset \omega_P . \]

$I_T$ will be referred to as the top interval of $T$. Similarly, $\xi_T$ and $\omega_T$ will be referred to as the top frequency and the top frequency interval of $T$.

We say that $T$ is 2-overlapping if $\xi_T \in C_2\omega_{P_2}$ for every $P \in T$, and we say that $T$ is 2-lacunary if $\xi_T \notin C_2\omega_{P_2}$ for every $P \in T$.

It is clear that any tree can be split into two trees, one of each type. Furthermore, the union of two trees with the same $(I_T, \xi_T)$ is a tree and we may use the pair $(I_T, \xi_T)$ for the new tree. If these two trees are 2-lacunary then the new tree is also 2-lacunary.
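
The tree condition is easy to test mechanically. The following sketch assumes that the top frequency interval $\omega_T$ is the interval of length $1/|I_T|$ centred at $\xi_T$, and represents a bitile only by its spatial interval $I_P$ and frequency interval $\omega_P$; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Bitile:
    space: tuple  # (left, right) endpoints of the spatial interval I_P
    freq: tuple   # (left, right) endpoints of the frequency interval omega_P

def top_frequency_interval(top, xi):
    """omega_T, assumed to be [xi - 1/(2|I_T|), xi + 1/(2|I_T|))."""
    half = 1 / (2 * (top[1] - top[0]))
    return (xi - half, xi + half)

def is_tree(bitiles, top, xi):
    """Check I_P subset of I_T and omega_T subset of omega_P for every bitile P."""
    w = top_frequency_interval(top, xi)
    return all(top[0] <= P.space[0] and P.space[1] <= top[1]
               and P.freq[0] <= w[0] and w[1] <= P.freq[1]
               for P in bitiles)
```

In the classical setting (bitiles of area 2) a collection such as $[0,2)\times[0,1)$ and $[2,3)\times[0,2)$ is a tree with top data $(I_T, \xi_T) = ([0,4), 1/2)$, but not with top frequency $\xi_T = 1$.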

Remark 3.2. By further requiring that K > 2 Ci+i * n ^ ne se P ara tion assumption 
(S3), we obtain the following properties (cf. Remark 3.1). Let $T$ be a tree and let $P, P' \in T$ be two different bitiles.

• If $|I_P| = |I_{P'}|$ then $I_P \cap I_{P'} = \emptyset$.

• If $T$ is 2-overlapping and $|I_P| > |I_{P'}|$ then $\omega_P \cap C_2\omega_{P'_1} = \emptyset$.

• If $T$ is 2-lacunary and $|I_P| > |I_{P'}|$ then $\omega_P \cap C_2\omega_{P'_2} = \emptyset$.

Remark 3.3. If there is a dyadic interval $J$ such that for every $P \in T$ we have $I_P \subset J$ then we can decompose $T$ into $O(1)$ subtrees, each of which has $J$ as top interval (the top frequencies of these subtrees are not necessarily the same, but they are $O(1/|J|)$ away from the original $\xi_T$). Essentially, this is because we would have $|\omega_P| \ge \frac{1}{|J|}$, and then one can always partition $T$ into two desired trees depending on the relative position of $\xi_T$ in $\omega_P$.

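3.2. Tile norms. Below, for any collection $\mathbf{Q}$ of bitiles we denote
\[ S_{\mathbf{Q}} f(x) := \Big( \sum_{P\in\mathbf{Q}} |\langle f, \phi_{P_1}\rangle|^2 \frac{1_{I_P}(x)}{|I_P|} \Big)^{1/2} . \]

Definition 3.1 (Size). The size of a collection $\mathbf{P}$ of bitiles is
\[ \mathrm{size}(\mathbf{P}) := \sup_{T\subset\mathbf{P}} w(I_T)^{-1/2} \|S_T f\|_{L^2(w)} . \]

The supremum is over all 2-overlapping trees $T \subset \mathbf{P}$.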
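It is clear that for $w = 1$ one recovers the standard definition of size (cf. [11]).
For any interval $J$, let
\[ \chi_J(x) = \Big( 1 + \Big( \frac{x - c(J)}{|J|} \Big)^2 \Big)^{-1/2} . \]

Note that if $J \subset I$ then $\chi_J \le C\chi_I$, and this estimate will be used implicitly in future estimates.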

Definition 3.2 (Density). Recall the definition of the functions $d_j$ from (2.3). Fix a large constant $D \in (0,\infty)$. The density of a collection $\mathbf{P}$ of bitiles is defined to be


T 



density(P) := sup ( --^ / x? T \g\ r ' E 1^1^ 



here the supremum is over nonempty trees $T \subset \mathbf{P}$.
We will choose $D$ to be very large, depending on $w$, $p$, $q$, $r$, in the proof of Proposition 2.1 in Section 6 (see also the proof of Lemma 3.11). All the implicit constants are allowed to depend on $D$.

When the elements of $\mathbf{P}$ are disjoint in the phase plane, the following improved notion of density is more useful in future estimates; see also Lemma 4.2.

Definition 3.3 (Improved Density). The improved density of a collection $\mathbf{P}$ of bitiles is defined to be

de^Tty(P) := sup f x? p \g\ r ' £ Idj^^ ■ 

It is clear that $\widetilde{\mathrm{density}}(\mathbf{P}) \le C\, \mathrm{density}(\mathbf{P})$ for any $\mathbf{P}$.

3.3. Decomposition by size. We have the following size bound:

Lemma 3.4. Assume $w \in A_q$. Then for any $N \ge 0$ there is a constant $C = C(N, q, w) < \infty$ such that for any $\mathbf{P}$
\[ \mathrm{size}(\mathbf{P}) \le C \sup_{P\in\mathbf{P}} \Big( \frac{1}{w(I_P)} \int |f|^q \chi_{I_P}^N\, w\,dx \Big)^{1/q} . \]

The main ingredient in the proof of Lemma 3.4 is the following John-Nirenberg characterization of size, which is a standard result in the Lebesgue setting (see e.g. [18]). The proof of the Lebesgue case of this characterization extends smoothly to the weighted setting (see [3, Lemma 3.5]); we omit the details.

Lemma 3.5. For any $1 \le p < \infty$ and any collection $\mathbf{P}$ we have
\[ \sup_{T\subset\mathbf{P}} \frac{\|S_T f\|_{L^p(w)}}{w(I_T)^{1/p}} \sim_p \sup_{T\subset\mathbf{P}} \frac{\|S_T f\|_{L^{1,\infty}(w)}}{w(I_T)} , \]
where the suprema are over all 2-overlapping trees.

Proof of Lemma 3.4 using Lemma 3.5. By decomposing $T$ into smaller subtrees (using Remark 3.3), we may assume that $I_T = I_P$ for some $P \in T$. Thus, it suffices to show that
\[ \|S_T f\|_{L^q(w)} \le C \|f \chi_{I_T}^N\|_{L^q(w)} . \]

But $w \in A_q$, hence $\|S_T f\|_{L^q(w)} \lesssim \|(S_T f)^\sharp\|_{L^q(w)}$. Therefore it suffices to show that for any $N < \infty$ we have
\[ (S_T f)^\sharp \le C M_1(f \chi_{I_T}^N) . \tag{3.4} \]

For any dyadic interval $J$ let
\[ c_J = \Big( \sum_{P\in T:\, J\subset I_P} \frac{|\langle f, \phi_{P_1}\rangle|^2}{|I_P|} \Big)^{1/2} . \]

Then
\[ \frac{1}{|J|} \int_J |S_T f(x) - c_J|\,dx \le \Big( \frac{1}{|J|} \int_J |S_T f(x)^2 - c_J^2|\,dx \Big)^{1/2} = \frac{1}{|J|^{1/2}} \Big\| \Big( \sum_{P\in T:\, I_P\subsetneq J} |\langle f, \phi_{P_1}\rangle|^2 \frac{1_{I_P}}{|I_P|} \Big)^{1/2} \Big\|_2 . \]

Using the known Lebesgue case of Lemma 3.4 (see e.g. [18, Lemma 6.8]), we obtain
\[ \frac{1}{|J|} \int_J |S_T f(x) - c_J|\,dx \le C \sup_{P\in T:\, I_P\subsetneq J} \frac{1}{|I_P|} \int |f(x)| \chi_{I_P}(x)^{N+1}\,dx \le C \inf_{x\in J\cap I_T} M_1(f \chi_{I_T}^N)(x) , \]
and (3.4) follows immediately. □
We remark that the following bound was proved in the above proof of Lemma 3.4:

Corollary 3.6. Assume $w \in A_q$. Then for any 2-overlapping tree $T$ and any $N \ge 0$ it holds that
\[ \|S_T f\|_{BMO} \le C_N \inf_{x\in I_T} M_1(f \chi_{I_T}^N)(x) ; \]
here we use the dyadic BMO norm.

For convenience, in the rest of the paper we say that a collection $\mathbf{T}$ of 2-overlapping trees is well-separated if the following conditions are satisfied:

(i) If $T, T' \in \mathbf{T}$ are two different trees, and $P \in T$ and $P' \in T'$ and $|I_P| > |I_{P'}|$ then either $C_2\omega_{P_1} \cap C_2\omega_{P'_1} = \emptyset$ or $I_{P'} \cap I_T = \emptyset$.

(ii) If $P, P' \in \bigcup_{T\in\mathbf{T}} T$ are two different bitiles with $|I_P| = |I_{P'}|$ then $I_P \times C_2\omega_{P_1}$ and $I_{P'} \times C_2\omega_{P'_1}$ are disjoint.

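Lemma 3.7. Let $\mathbf{P}$ be a collection of bitiles with size bounded above by $2\alpha$, for some $\alpha > 0$. Then we can find a collection $\mathbf{T}$ of trees such that:

• The bitile collection $\mathbf{P} - \bigcup_{T\in\mathbf{T}} T$ has size less than $\alpha$.

• If another tree collection $\mathbf{T}'$ covers $\bigcup_{T\in\mathbf{T}} T$ then for some $C = C(w) < \infty$
\[ \sum_{T\in\mathbf{T}} w(I_T) \le C \sum_{T'\in\mathbf{T}'} w(I_{T'}) . \tag{3.5} \]

• If $q_0 \in (q,\infty)$ then there exists $\beta = \beta(p,w,q,q_0) < \infty$ such that for any $k \ge 0$ and for any $1 \le p < \infty$ we have
\[ \Big\| \sum_{T\in\mathbf{T}} 1_{2^k I_T} \Big\|_{L^p(w)} \le C 2^{\beta k} \alpha^{-2q} \|f\|_{L^{2pq_0}(w)}^{2q} . \tag{3.6} \]

Here $C = C(p, w, q, q_0) < \infty$.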

Proof. For convenience let $a_P = \langle f, \phi_{P_1}\rangle$. We follow the standard algorithm from [12]. If $\mathrm{size}(\mathbf{P}) \ge \alpha$ then there exists a non-empty 2-overlapping tree $T_2 \subset \mathbf{P}$ such that $\|S_{T_2} f\|_{L^2(w)}^2 \ge \alpha^2 w(I_{T_2})$. We select such a tree with minimal value of $\xi_{T_2}$ ^3, and let $T$ be the maximal tree in $\mathbf{P}$ with top data $(I_{T_2}, \xi_{T_2})$. We then remove from $\mathbf{P}$ the bitiles in $T$ and repeat this argument until the remaining collection of bitiles has size less than $\alpha$. We obtain a collection $\mathbf{T}$ of trees such that

• $\mathbf{P} - \bigcup_{T\in\mathbf{T}} T$ has size less than $\alpha$;

• Each $T \in \mathbf{T}$ contains a 2-overlapping subtree $T_2$ such that
\[ w(I_T) \le C\alpha^{-2} \|S_{T_2} f\|_{L^2(w)}^2 = C\alpha^{-2} \sum_{P\in T_2} |a_P|^2 \frac{w(I_P)}{|I_P|} . \tag{3.7} \]

It then follows from a standard geometrical consideration that the tree collection $\mathbf{T}_2 := \{T_2 : T \in \mathbf{T}\}$ is well-separated when the constant $K_0$ in (S3) is chosen sufficiently large (see also Remark 3.1). We omit the details.

^3 To be more careful, one can fix a top frequency for each of these trees, and then select one tree (there are only finitely many of them) whose top frequency is minimal.
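Proof of (3.5): Assume that $\mathbf{T}'$ covers $\mathbf{Q} := \bigcup_{T\in\mathbf{T}} T$; without loss of generality we can assume $\bigcup_{T'\in\mathbf{T}'} T' = \mathbf{Q}$. Let $\mathbf{Q}_2 = \bigcup_{T\in\mathbf{T}} T_2$. It follows from (3.7) that
\[ \sum_{T\in\mathbf{T}} w(I_T) \le C\alpha^{-2} \sum_{P\in\mathbf{Q}_2} |a_P|^2 \frac{w(I_P)}{|I_P|} . \tag{3.8} \]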

Now, divide each T' G T' into three trees, 

U = {P e T' : inf C 2 u Pl < £r< < supCWft} , 
7;' = {Fef: supCsWpj < Ct' < inf C 2 cjp 2 } , 

^ = {FeT':^eC* 2WP2 } . 

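Clearly, $T'_2$ is 2-overlapping. Since $\mathrm{size}(\mathbf{P}) \le C\alpha$, we have
\[ \sum_{P\in T'_2} |\langle f, \phi_{P_1}\rangle|^2 \frac{w(I_P)}{|I_P|} = \|S_{T'_2} f\|_{L^2(w)}^2 \le C\alpha^2 w(I_{T'}) . \tag{3.9} \]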

On the other hand, since T2 is well separated, the rectangles Ip x [inf CWpl , sup C^uip^ 
with $P \in \mathbf{Q}_2$ are pairwise disjoint in the phase plane. This implies that the bitiles of $T'_0 \cap \mathbf{Q}_2$ are spatially disjoint (since their frequency intervals overlap). Thus,
\[ \sum_{P\in T'_0\cap\mathbf{Q}_2} |\langle f, \phi_{P_1}\rangle|^2 \frac{w(I_P)}{|I_P|} \le \sum_{P\in T'_0\cap\mathbf{Q}_2} \mathrm{size}(\{P\})^2 w(I_P) \le C\alpha^2 w(I_{T'}) . \tag{3.10} \]

Next, we show that $T'_1 \cap \mathbf{Q}_2$ can be grouped into $O(1)$ collections of 2-overlapping trees whose top intervals are disjoint. Together with the given assumption on the size of $\mathbf{P}$, this would imply
\[ \sum_{P\in T'_1\cap\mathbf{Q}_2} |\langle f, \phi_{P_1}\rangle|^2 \frac{w(I_P)}{|I_P|} \le C\alpha^2 w(I_{T'}) . \tag{3.11} \]

Let $\mathcal{M}$ be the set of elements of $T'_1 \cap \mathbf{Q}_2$ with maximal spatial intervals. The grouping of elements in $T'_1 \cap \mathbf{Q}_2$ can be done as follows:

• Any element $P \in \mathcal{M}$ can be viewed as one 2-overlapping tree, and we place these single-element trees into the first tree collection.

• For any $P \in \mathcal{M}$, we show below that we can place every $P' \in T'_1 \cap \mathbf{Q}_2$ such that $I_{P'} \subsetneq I_P$ in $O(1)$ trees sharing the top interval $I_P$.

Since the intervals $\{I_P : P \in \mathcal{M}\}$ are disjoint, it remains to show that if $P' \in T'_1 \cap \mathbf{Q}_2$ and $I_{P'} \subsetneq I_P$ then

(3.12) inf C^Wp^ < supC2Wp 2 < sup C^^p^ . 

Indeed, since \u>p^\ = yrT\ — JT^\ ^ follows from (3.11) that we may take — 2j7>[ + 
supC2Wp 2 or 9|7^ + supC2Wp 2 as the top frequency for these trees. 

To see the first inequality in (3.12), we assume (towards a contradiction) that 
supC2Wp 2 < inf C2WPJ. By the selection algorithm, the 2-overlapping tree S G T 2 
that contains P must be selected before the 2-overlapping tree S' of P' . Now, by 
definition of T[ we have 

[sup C 3 u} Pl , inf C 2 ujp 2 ) n [sup C 3 u} P ^ , inf C 2 wp^ ) 7^ 

(they both contain $\xi_{T'}$). On the other hand, by ensuring the constant $K_0$ is sufficiently large in the separation assumption (S3), we have $\omega_P \subset \text{convex hull}(C_2\omega_{P'_1} \cup C_2\omega_{P'_2})$. But then $P'$ must be cleared out as part of the maximal tree with the same top data as $S$, leading to a contradiction. This proves the first half of (3.12).
To see the second inequality in (3.12), as before exploit the fact that 

[sup C 3 ujp 1 , inf C 2 up 2 ) n [sup C 3 oj p ^ , inf C 2 uj p ^ ) ^ . 

By ensuring the constant Kq in the separation assumption (S3) is sufficiently large, 
we have |w.f»| > |wp 2 |. As a consequence, if supC 2 ujp 2 > supC 2 ujp^ then the in- 
terval [supC 3 ujp 1 , inf C 2 ujp 2 ) will be above inf C 2 ujp^, contradicting the nonempty 
intersection. This completes the proof of (3.12) and hence (3.11). 
Finally, collecting inequalities (3.9), (3.10), (3.11), we obtain
\[ \sum_{P\in T'\cap\mathbf{Q}_2} |\langle f, \phi_{P_1}\rangle|^2 \frac{w(I_P)}{|I_P|} \le C\alpha^2 w(I_{T'}) . \]

Summing over $T' \in \mathbf{T}'$ and using (3.8), we obtain the desired estimate (3.5).

Proof of (3.6): Fix $k$ and let
\[ N^{[k]} := \sum_{T\in\mathbf{T}} 1_{2^k I_T} . \]

It suffices to show the following good lambda estimate: given any L £ (0, oo) there 
exists co £ (0, oo) and c £ (0, oo) such that 

(3.13) w({N [k] > A} n E^ ) < yw({N [k] > A/4}) . 

(3.14) where E [k] := {M 2q . w f < c2- c ° k a\^} . 

Indeed, choosing L sufficiently large (depending on p £ [l,oo)) and applying a 
standard bootstrapping argument, we obtain 

II £ h T h Hw) < C2°Wa- 2 «°\\M 2q ,Uf)\\% l0(w) 
TeT 

<C2°Wa- 2 «°\\f\\ 2 £ mo(w) , 

as desired. Here we have used the fact that Mi yW is bounded from L l {w) — > L l (w) 
for any 1 < t < oo and any positive weight w; note that we always have 2pq Q > 2q. 

To prove (3.13), we use the following estimate which follows from Lemma 3.8 
(see the remark after the Lemma): for any dyadic interval / and qo £ (9,00) it 
holds that 

(3.15) w({N\ k] > A/4}) < C2°^a- 2q X-^w(I)[M M 2q . w (fx?)(x)] 2q 

:= £ l lT . 

T£T:I T CI 

Let I be the collection of all maximal dyadic intervals of {A^ > A/4}. We apply 
(3.15) to elements of I that intersect E^K Let I be one such interval, then it follows 
from the maximality of I that {A^ > A} n / is a subset of {N [k] > A/4}. Thus, 



({AW > A} n I) < C2°^ \a~ 2q \~™ w{I)} [c2- Cofc aA^ 



2<; 
and by choosing $c$ sufficiently small and $c_0$ sufficiently large we obtain
\[ w(\{N^{[k]} > \lambda\} \cap I) \le C c^{2q} w(I) \le \frac{w(I)}{L} . \]
Summing the above estimates over all $I \in \mathbf{I}$ that intersect $E^{[k]}$ we obtain (3.13):
\[ w(\{N^{[k]} > \lambda\} \cap E^{[k]}) \le \sum_{I} w(\{N^{[k]} > \lambda\} \cap I) \le \frac{1}{L} \sum_{I} w(I) = \frac{1}{L} w(\{N^{[k]} > \lambda/4\}) . \]

□



(3.18) w({N^ > A}) < C2 0{ ^w{I) fa -1 A _3 « inf M 2q>w (fxi )(x) 



Lemma 3.8. Let $I$ be an interval and let $\mathbf{T}$ be a well-separated collection of 2-overlapping trees such that for any $T \in \mathbf{T}$ we have $I_T \subset I$, and
\[ w(I_T) \le C\alpha^{-2} \|S_T f\|_{L^2(w)}^2 . \tag{3.16} \]

Then for any $q_0 \in (q,\infty)$ and $N \ge 0$ there is $C = C(q_0, w, N) < \infty$ such that

(3.17) W ({7V[ fc l > A}) < C2°« [a-^-'fe Hl M -)] , 

TGT 

The implicit constant in $O(k)$ depends on $w$ and $q$.
Remark: As a consequence of (3.17), we obtain 

x£l 

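Proof. Since $N^{[k]}$ is integer-valued, without loss of generality we may assume $\lambda \ge 1/2$. We estimate
\[ w(\{N^{[k]} > \lambda\}) \le \sum_{l\ge 0} w(\{2^l\lambda < N^{[k]} \le 2^{l+1}\lambda\}) \tag{3.19} \]
and it is not hard to see that
\[ w(\{2^l\lambda < N^{[k]} \le 2^{l+1}\lambda\}) \le w(\{N_l^{[k]} > 2^l\lambda\}) \tag{3.20} \]
where
\[ N_l^{[k]} := \sum_{T\in\mathbf{T}_l} 1_{2^k I_T} \quad\text{and}\quad \mathbf{T}_l := \{T \in \mathbf{T} : 2^k I_T \not\subset \{N^{[k]} > 2^{l+1}\lambda\}\} . \]
Write $N_l$ for $N_l^{[0]}$; clearly $N_l \le N_l^{[k]}$ for $k \ge 0$. We first show that
\[ \|N_l\|_\infty \le 2^{l+1}\lambda . \tag{3.21} \]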

Indeed, take any x, and let T x = {T G T/ : x G It}- Clearly, 

N t (x) < h"i T ■ 
tgt x 

Since the collection of top intervals of elements of $\mathbf{T}_x$ is nested, there is one minimal element. Note that if $I_1 \subset I_2$ are two intervals then for $k \ge 0$ we have $2^k I_1 \subset 2^k I_2$. Therefore the intervals $2^k I_T$ with $T \in \mathbf{T}_x$ are also nested and the minimal of them contains a point $y \in \{N^{[k]} \le 2^{l+1}\lambda\}$ by definition of $\mathbf{T}_l$. Since $y$ then belongs to $2^k I_T$ for every $T \in \mathbf{T}_x$, we obtain
\[ N_l(x) \le N^{[k]}(y) \le 2^{l+1}\lambda , \]
completing the proof of (3.21).

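Now, denote $\mathbf{P}_l = \bigcup_{T\in\mathbf{T}_l} T$ and as usual
\[ S_{\mathbf{P}_l} f = \Big( \sum_{P\in\mathbf{P}_l} |\langle f, \phi_{P_1}\rangle|^2 \frac{1_{I_P}}{|I_P|} \Big)^{1/2} . \]

It follows from (3.16), (1.5), and Hölder's inequality that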

(3-22) a 2 ^N^\\ LHw) <C2^\\S P J\\% q(w) . 

For N large let // = fxi- The key estimate in our proof of (3.17) is 

Claim 3.9. For any s £ (0, 1) and S > there is C — C(e, s, N) < oo such that 

(3.23) (5 Pl /)« < CWmWt^Mih + [aM 2 (N?)] s (M 2 f I ) 1 - s ) . 

Below we show (3.17) using the above claim. It follows from (3.22), (3.23), and 
the assumption w £ A q that 

a\\Nl k] \\f 1{w) <C2°^\\(S Pl fn L i Hw} 
< ^WIIATjII^^I/xIU^^ + [«||iV z 1/2 |U 2g(to) ] S ||/ z ||^ (uj) ) 

Here 5 > and s > will be chosen very close to 0. Consequently, after bootstrap- 
ping, it follows that for any e > 

a||JV I [fc] ||* (w) <C72 o ( fc )||JV I || 6 / 2 «||/r|U a . (lB) . 
Therefore, it follows from the bound $\|N_l\|_\infty \le 2^{l+1}\lambda$ of (3.21) that

$w(\{N_l^{[k]} > 2^l\lambda\}) \le C 2^{O(k)} 2^{-l(1-\epsilon)} \alpha^{-2q} \lambda^{-1+\epsilon} w(I) \big[\inf_{x \in I} M_{2q,w} f(x)\big]^{2q}$ .

Choosing $\epsilon > 0$ very small allows for summation over $l \ge 0$ of the above estimate.
Using (3.19) and (3.20), we obtain the desired estimate (3.17).

Proof of Claim 3.9: Fix any dyadic $J$. For any $T \in \mathbf{T}_l$ let $T_J := \{P \in T : I_P \subset J\}$,
and by decomposing $T_J$ into $O(1)$ subtrees we may assume that $T_J$ is a tree with
a new top interval $I_T \cap J$ for every $T \in \mathbf{T}_l$. It suffices to show that for any $x \in J$:

( 3 - 24 ) FmTa^ ^ \ S Tjf\ 2 ) 1/2 \\^ < the value at x of RHS of (3.23) . 

By Lemma 3.10, for any $0 < s < 1$ there is $C = C_s < \infty$ such that

(3.25)  $\Big(\sum_{T \in \mathbf{T}_l} \|S_{T_J} f\|_2^2\Big)^{1/2} \le C\|f\|_2 + C\alpha^s \|N_l^{1/2}\|^s_{L^2(J)} \|f\|_2^{1-s}$ .

Here we've used the fact that for any P £ P 



— s 

2 
Since for any $P \in T_J$ we have $I_P \subset I \cap J$, it follows from Corollary 3.6 that

(3.26)  $\|S_{T_J} f\|_{BMO} \le C \inf_{x \in I \cap J} M_1(f\chi^N_{I \cap J})(x)$ .

Interpolate the estimates (3.25) and (3.26) to prove (3.24) using a now-standard
localization argument (see e.g. [13]). The idea is to decompose $f = \sum_{k \ge 0} f_k$ where
$f_0 = f 1_{I \cap J}$ and $f_k = f 1_{2^k(I \cap J) \setminus 2^{k-1}(I \cap J)}$ for $k \ge 1$ and apply (3.25) and (3.26) to
$f_k$. More specifically, for $p \in (2, \infty)$ we have



TGT, 

< 



i(E \ s Tjk\ p ) 1/p \\ P = (J2 W S TJk\\ P p 

TGT, TGT, 

< ( E II^MH) 1 sup WStJhWbmo 
- / tgt, 



C N ^- Ifk \InJ\ 1 ^ ai M j (M 2 fi(x) + [aM 2 (Nl /2 )(x)] » [^ 2 //(x)] 1 * 
Summing over $k \ge 0$ we obtain

ikE i^/n 1/p 



TGT; 



(3.27) < C\J\V» inf (^f 2 /j(x) + [aM 2 (N} /2 )(x)]f Ma/j^) 1 "^ 
On the other hand, using Hölder's inequality it follows that

(3.28) ||(E \Sr 3 h?) x/ X<\Wi\\~'\\(Y, \ S T.J\ P ) 1/P \\ P ■ 

tgt, tgt, 
Combining (3.27) and (3.28) and using Hölder's inequality, it follows that

I I TGT, |J| TGT, 

<C\\N l \\t^[M 2 f I (x)+[aM 2 (VN l )(x)]^[M 2 fi(x)] 1 ~ 2 " . 
Choosing p > 2 sufficiently close to 2 we obtain the desired estimate (3.24). 



□ 



The following Lemma, needed for our proof of Claim 3.9, is contained implicitly 
in [24], where in fact a stronger logarithmic variant was proved (see also [9] for a 
vector valued generalization). 

Lemma 3.10. Let $\mathbf{T}$ be a well-separated collection of 2-overlapping trees and let
$\mathbf{P} = \bigcup_{T \in \mathbf{T}} T$. Then for any $0 < s < 1$ it holds that

(3.29)  $\Big(\sum_{P \in \mathbf{P}} |\langle f, \phi_{P_1}\rangle|^2\Big)^{1/2} \le C_s \Big(\|f\|_2 + \Big[\sup_{P \in \mathbf{P}} \frac{|\langle f, \phi_{P_1}\rangle|}{|I_P|^{1/2}} \Big(\sum_{T \in \mathbf{T}} |I_T|\Big)^{1/2}\Big]^s \|f\|_2^{1-s}\Big)$ .

Remark: While any $0 < s < 1$ would be enough for applications to the Lebesgue
setting of Carleson theorems (see e.g. [12] and [19] where $s = 1/3$ is used), our
applications to Claim 3.9 require arbitrarily small $s > 0$. We include a proof of
(3.29) (following largely [24]) below.
Proof. Without loss of generality we may assume $\|f\|_2 = 1$. Denote

$N = \sum_{T \in \mathbf{T}} 1_{I_T}$ ,  $a_P = \langle f, \phi_{P_1}\rangle$ ,  $B = \sup_{P \in \mathbf{P}} \frac{|a_P|}{|I_P|^{1/2}}$ .

We then divide $\mathbf{P}$ into subcollections $\mathbf{P}_k$, where for any $k \ge 0$ we have

$\mathbf{P}_k = \Big\{P \in \mathbf{P} : 2^{-k-1}B < \frac{|a_P|}{|I_P|^{1/2}} \le 2^{-k}B\Big\}$ ,

and let $\mathbf{P}_{\ge k} = \bigcup_{j \ge k} \mathbf{P}_j$. Using the known special case $s = 1/3$ of (3.29) proved in
[12] (see also [19] for a setting similar to the current paper) for the restriction to
$\mathbf{P}_{\ge k}$ of the tree collection $\mathbf{T}$, we have

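$\Big(\sum_{P \in \mathbf{P}_{\ge k}} |a_P|^2\Big)^{1/2} \le C + C(2^{-k}B)^{1/3} \|N\|_1^{1/6}$ ;

in particular for $k \ge \max\big(0, \log_2\big[B(\sum_{T \in \mathbf{T}} |I_T|)^{1/2}\big]\big)$ we have

(3.30)  $\Big(\sum_{P \in \mathbf{P}_{\ge k}} |a_P|^2\Big)^{1/2} \le C$ .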

On the other hand, it follows from the definition of $\mathbf{P}_k$ that

(3.31)  $\Big(\sum_{P \in \mathbf{P}_k} |a_P|^2\Big)^{1/2} \sim 2^{-k}B \Big(\sum_{P \in \mathbf{P}_k} |I_P|\Big)^{1/2}$ .

We can also view $\mathbf{P}_k$ as a collection of single-bitile trees, which is clearly
well-separated. Thus again using the known case $s = 1/3$ of (3.29), it follows that

(3.32)  $\Big(\sum_{P \in \mathbf{P}_k} |a_P|^2\Big)^{1/2} \le C + C\Big[2^{-k}B\Big(\sum_{P \in \mathbf{P}_k} |I_P|\Big)^{1/2}\Big]^{1/3}$ .

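Combining (3.31) and (3.32), it follows that for any $k \ge 0$ we have

(3.33)  $\Big(\sum_{P \in \mathbf{P}_k} |a_P|^2\Big)^{1/2} \le C$ .

Using (3.30) and (3.33) we obtain

$\sum_{P \in \mathbf{P}} |a_P|^2 \le C + C\max\big(0, \log_2[B\|N\|_1^{1/2}]\big)$ .

Using the trivial estimate $\max(0, \log x) \le x$ for any $x > 0$, applied to $x = (B\|N\|_1^{1/2})^{2s}$, we obtain

$\Big(\sum_{P \in \mathbf{P}} |a_P|^2\Big)^{1/2} \le C_s\big(1 + [B\|N\|_1^{1/2}]^s\big)$

for any $0 < s < 1$, as desired. □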
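The step from the logarithmic bound to the power bound can be spelled out as follows. This is a routine elaboration rather than something quoted from the source; the shorthand $X := B\|N\|_1^{1/2}$ is introduced here only for readability.

```latex
% Assumption: X := B \|N\|_1^{1/2} is shorthand introduced for this sketch.
% For X > 0 and 0 < s < 1, use \ln y \le y with y = X^{2s}:
\[
\max(0,\log_2 X)
  = \frac{1}{2s}\,\max\bigl(0,\log_2 X^{2s}\bigr)
  \le \frac{1}{2s\ln 2}\,X^{2s}.
\]
% Hence
\[
\sum_{P\in\mathbf{P}} |a_P|^2 \le C + \frac{C}{2s\ln 2}\,X^{2s},
\qquad
\Bigl(\sum_{P\in\mathbf{P}} |a_P|^2\Bigr)^{1/2} \le C_s\,(1 + X^{s}),
\]
% where the last step uses \sqrt{a+b} \le \sqrt{a}+\sqrt{b}.
```

The constant $C_s$ grows like $s^{-1/2}$ as $s \to 0$ in this sketch, which is harmless since $s$ is fixed before the estimate is applied.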
3.4. Decomposition by density. Since $|g| \le 1_G$, the density of any collection
is bounded above by 1. For the result below, it is important that the constant
$D$ in the definition of density is sufficiently large, much bigger than the doubling
exponent $\gamma$ of $w$. We return to this point in the proof.

Lemma 3.11. For any collection $\mathbf{P}$ of bitiles and any $\alpha > 0$ we can find a collection
$\mathbf{T}$ of trees such that the density of $\mathbf{P} - \bigcup_{T \in \mathbf{T}} T$ is bounded above by $\alpha$ and

$\sum_{T \in \mathbf{T}} w(I_T) \le C\alpha^{-r'} w(G)$ ;

here $r$ is the variational exponent used in the definition of density.

Remark: This is a weighted extension of [19, Proposition 4.4], and the proof 
below is adapted from [19], which is in turn a variational adaptation of the standard 
argument. The variant of Lemma 3.11 with improved density follows immediately, 
since for any P we have density(P) < C density '(P). 

Proof. If density($\mathbf{P}$) $> \alpha$ then there is a nonempty tree $T \subset \mathbf{P}$ such that

(3.34)  $w(I_T) < \alpha^{-r'} \int \chi_{I_T}^D |g|^{r'} \sum_{j : N_j \in \omega_T} |d_j|^{r'} w$ .

We select $T$ such that $|I_T|$ is maximal, and then by enlarging $T$ (keeping $I_T$ and $\xi_T$)
if necessary we may assume that $T$ is maximal in $\mathbf{P}$ with respect to set inclusion.
Let $T_+$ and $T_-$ be the maximal trees in $\mathbf{P}$ with the same top interval as $T$ but with
top frequencies $\xi_T - \frac{1}{2|I_T|}$ and $\xi_T + \frac{1}{2|I_T|}$ respectively. We then remove from $\mathbf{P}$ the
union of $T$, $T_+$, $T_-$. Continuing this selection process, which will stop since $\mathbf{P}$ is
assumed finite, we obtain a collection $\mathbf{T}$ of trees, such that

density$\big(\mathbf{P} - \bigcup_{T \in \mathbf{T}} (T \cup T_- \cup T_+)\big) \le \alpha$ .

It remains to show that

$\sum_{T \in \mathbf{T}} w(I_T) \le C\alpha^{-r'} w(G)$ .

By the selection algorithm, it is not hard to see that for $T \ne T'$ in $\mathbf{T}$ the rectangles
$I_T \times \omega_T$ and $I_{T'} \times \omega_{T'}$ are disjoint. Now, it follows from (3.34) that for any $T \in \mathbf{T}$
there exists an integer $k = k(T) \ge 0$ such that

(3.35)  $w(I_T) \le C2^{-Dk}\alpha^{-r'} \int_{2^k I_T} |g|^{r'} \sum_{j : N_j \in \omega_T} |d_j|^{r'} w$ .

We then sort the trees in $\mathbf{T}$ according to the value of $k(T)$. More specifically for
each $k \ge 0$ let $\mathbf{T}_k = \{T \in \mathbf{T} : k(T) = k\}$. It suffices to show that

(3.36)  $\sum_{T \in \mathbf{T}_k} w(I_T) \le C\alpha^{-r'} 2^{-k} w(G)$ .

Fix $k$. Select a subcollection $\mathbf{S}_k \subset \mathbf{T}_k$ such that the rectangles $2^k I_S \times \omega_S$ with
$S \in \mathbf{S}_k$ are pairwise disjoint, and such that

(3.37)  $\sum_{T \in \mathbf{T}_k} w(I_T) \le C \sum_{S \in \mathbf{S}_k} w(2^{k+2} I_S)$ .

Note that this will imply the desired estimate (3.36). By choosing $D > \gamma + 10$,
where $\gamma$ is the doubling exponent for $w$, it follows from (3.35) and (3.37) that

$\sum_{T \in \mathbf{T}_k} w(I_T) \le C2^{k\gamma} \sum_{S \in \mathbf{S}_k} w(I_S)$
$\le C2^{-k}\alpha^{-r'} \int \sum_j \sum_{S \in \mathbf{S}_k} 1_{\{(x, N_j(x)) \in 2^k I_S \times \omega_S\}} |d_j|^{r'} |g|^{r'} w$
$\le C2^{-k}\alpha^{-r'} \int |g|^{r'} \sum_j |d_j|^{r'} w \le C2^{-k}\alpha^{-r'} w(G)$ .

It remains to select $\mathbf{S}_k$. Assume without loss of generality that $\mathbf{T}_k$ is nonempty.
Then we choose $S \in \mathbf{T}_k$ such that $|I_S|$ is maximal and then remove all $T \in \mathbf{T}_k$ if

$(2^k I_T \times \omega_T) \cap (2^k I_S \times \omega_S) \ne \emptyset$ .

Starting from the remaining collection, we repeat the above selection procedure
until no trees are left. We then let $\mathbf{S}_k$ be the collection of selected trees. For any
$S \in \mathbf{S}_k$, let $\mathbf{T}_S$ denote the collection of trees in $\mathbf{T}_k$ that are removed after $S$ is
selected; then to show (3.37) it suffices to show that

(3.38)  $\sum_{T \in \mathbf{T}_S} 1_{I_T} \le C 1_{2^{k+2} I_S}$ .

Note that if $T \in \mathbf{T}_S$ then $|I_T| \le |I_S|$ and $2^k I_T \cap 2^k I_S \ne \emptyset$, so clearly $I_T \subset 2^{k+2} I_S$.
Also $|\omega_T| \ge |\omega_S|$ and $\omega_T \cap \omega_S \ne \emptyset$, so out of any four trees in $\mathbf{T}_S$ at least two of
them will have overlapping top frequency intervals. The desired estimate (3.38) then
follows from the fact that the rectangles $I_T \times \omega_T$ (with $T \in \mathbf{T}_S$) are disjoint. □
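The greedy selection used to build the subcollection with disjoint dilated rectangles can be sketched in a few lines. This is only an illustration under simplifying assumptions: each tree top is modelled as a plain rectangle `(x_lo, x_hi, f_lo, f_hi)`, and the names `dilate`, `overlap` and `greedy_select` are hypothetical helpers introduced here, not notation from the paper.

```python
from typing import List, Tuple

# Hypothetical model of a tree top: (x_lo, x_hi, f_lo, f_hi), a rectangle
# I_T x omega_T in the phase plane with half-open sides.
Rect = Tuple[float, float, float, float]

def dilate(r: Rect, k: int) -> Rect:
    """Dilate the spatial side by 2**k about its centre; keep the frequency side."""
    centre = 0.5 * (r[0] + r[1])
    half = 0.5 * (r[1] - r[0]) * 2 ** k
    return (centre - half, centre + half, r[2], r[3])

def overlap(a: Rect, b: Rect) -> bool:
    """True when the two rectangles intersect in both coordinates."""
    return a[0] < b[1] and b[0] < a[1] and a[2] < b[3] and b[2] < a[3]

def greedy_select(tops: List[Rect], k: int):
    """Repeatedly pick a top with the longest spatial interval, then discard
    every remaining top whose dilated rectangle meets the dilated pick."""
    remaining = sorted(tops, key=lambda r: r[1] - r[0], reverse=True)
    selected, groups = [], []
    while remaining:
        s = remaining[0]
        ds = dilate(s, k)
        hit = [t for t in remaining if overlap(dilate(t, k), ds)]
        selected.append(s)
        groups.append(hit)  # the trees removed when s is selected (includes s)
        remaining = [t for t in remaining if not overlap(dilate(t, k), ds)]
    return selected, groups
```

By construction the dilated rectangles of the selected tops are pairwise disjoint, and every input top lands in exactly one removal group, which mirrors the roles of the selected collection and the removed collections in the argument above.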

4. The tree estimate 

In this section we prove several estimates for the restriction of the (model) Carleson
operator to a tree. Lemma 4.1 is applicable to any tree, while Lemma 4.2
improves the $L^1$ case of Lemma 4.1 when the elements of the underlying tree are
disjoint in the phase plane.

Recall that for any bitile collection $\mathbf{Q}$ we denote

$C_{\mathbf{Q}} f(x) = \sum_{P \in \mathbf{Q}} \langle f, \phi_{P_1}\rangle \phi_{P_1}(x) d_P(x)$

with $d_P$ defined as follows: First, $(d_k)_{k \ge 1}$ and $(N_k)_{k \ge 0}$ are two sequences of measurable
functions of $x$, such that

• For each $x$ there is some integer $K = K(x) < \infty$ such that $d_k(x) = 0$ for
$k > K$, and uniformly over $x$ we have $\sum_{k \ge 1} |d_k(x)|^{r'} = 1$.

• For any $x$ we have $N_0(x) < N_1(x) < \cdots$.

Then for each $x$ define $d_P(x) = 0$ unless there exists an index $k$ such that $N_{k-1} \notin \omega_P$
and $N_k \in \omega_{P_2}$, in which case such index is unique and we define $d_P(x) := d_k(x)$.
We note that if $P \in \mathbf{P}$ then

$\int \chi_{I_P}^D |g|^{r'} \sum_{k : N_k \in \omega_{P_2}} |d_k|^{r'} w \le Cw(I_P)\, \mathrm{density}(\{P\})^{r'}$ .

The above observation will be used implicitly below. 
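The rule defining $d_P(x)$ at a fixed point $x$ can be illustrated with a small sketch. It assumes, purely for illustration, that frequency intervals are half-open pairs `(lo, hi)`, that $\omega_{P_2} \subset \omega_P$, and that the values $N_0 < N_1 < \cdots$ and $d_1, d_2, \ldots$ at the point $x$ are given as lists; the function name `d_P` and its signature are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # half-open [lo, hi); an assumption of this sketch

def contains(iv: Interval, v: float) -> bool:
    return iv[0] <= v < iv[1]

def d_P(N: Sequence[float], d: Sequence[float],
        omega_P: Interval, omega_P2: Interval) -> float:
    """Value of d_P at one point x.

    N[k] plays the role of N_k(x) (strictly increasing), d[k] the role of
    d_k(x) for k >= 1 (d[0] is unused). Returns d[k] for the index k with
    N[k-1] outside omega_P and N[k] inside omega_P2, and 0.0 if none exists.
    """
    hits = [k for k in range(1, len(N))
            if not contains(omega_P, N[k - 1]) and contains(omega_P2, N[k])]
    # Since N is increasing and omega_P2 lies inside the interval omega_P,
    # two different indices cannot both qualify.
    assert len(hits) <= 1
    return d[hits[0]] if hits else 0.0
```

The uniqueness assertion reflects the remark in the text: if $k < k'$ both qualified, then $N_k \le N_{k'-1} < N_{k'}$ with the outer two values in the interval $\omega_P$, forcing $N_{k'-1} \in \omega_P$, a contradiction.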






Lemma 4.1. Let $T$ be a tree. Assume $s \in [1, r']$. Then there exists some $C = C(s,w) < \infty$ such that

(4.1)  $\|1_{I_T}\, g C_T f\|_{L^s(w)} \le Cw(I_T)^{1/s}\, \mathrm{size}(T)\, \mathrm{density}(T)$ ,

and furthermore for any $N > 0$ there exists $C = C(N,s,w) < \infty$ such that the
following inequality holds for any $k \ge 0$:

(4.2)  $\|1_{2^{k+1} I_T \setminus 2^k I_T}\, g C_T f\|_{L^s(w)} \le C2^{-Nk} w(I_T)^{1/s}\, \mathrm{size}(T)\, \mathrm{density}(T)$ .

Remark: As a consequence, we obtain for any $s \in [1, r']$:

(4.3)  $\|g C_T f\|_{L^s(w)} \le Cw(I_T)^{1/s}\, \mathrm{size}(T)\, \mathrm{density}(T)$ .

Proof. By Hölder's inequality and using the doubling property of $w$ it suffices to
show (4.1) and (4.2) for $s = r'$, and this will be assumed in the rest of the proof.
By dividing $T$ into two subtrees, if necessary, we can assume that the tree is either
2-overlapping or 2-lacunary. We will return to this distinction below.

Proof of (4.1): We will prove a stronger estimate, where the restriction $1_{I_T}$ is not
required. Let $\mathbf{J}$ be the set of maximal dyadic intervals $J$ such that

$I_P \not\subset 3J$

for any $P \in T$. It is not hard to see that $\mathbf{J}$ partitions $\mathbb{R}$. Let

(4.4)  $T_J := \{P \in T : |I_P| \le C_4 |J|\}$ ,

for some absolute constant $C_4 > 4$ to be chosen later. The left hand side of (4.1) (with
$s = r'$ now) is bounded above by $A + B$ where

(4.5)  $A := \Big(\sum_{J \in \mathbf{J}} \int_J |g C_{T_J} f|^{r'} w\Big)^{1/r'}$ ,

(4.6)  $B := \Big(\sum_{J \in \mathbf{J}} \int_J |g C_{T \setminus T_J} f|^{r'} w\Big)^{1/r'}$ .

To bound A, we fix J € J and first estimate the contribution of each P € Tj: 
( / \gC {P} ff w ) l/r ' < C N ^0{ f \ljgxl +D dp\ r w)^' 



\Ip\ 

< Cw{I P ) l ' r ' sizc({F})dcnsity({P})supx/,(2/) Ar . 
Using the triangle inequality, it follows that 

, \l/r' 

\gC Tj f\ r wj 

(4.7) < Csizc(T) density (T) V w{I P f' r ' (1 + dlSt , ( JJp) )~ AN . 

By the $A_\infty$ property of $w$ there exists a constant $\beta_0 > 0$ such that if $I \subset I'$ are two
intervals then

$\frac{w(I)}{w(I')} \le C\Big(\frac{|I|}{|I'|}\Big)^{\beta_0}$ .

Without loss of generality, we may choose the doubling constant $\gamma$ in (1.5) to be
large enough such that $\gamma > \beta_0$.

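For any $P \in T_J$ we can find an interval $K$ of length comparable to $|I_P| + |J| +
\mathrm{dist}(J, I_P)$ that contains both $I_P$ and $J$. Since $|I_P| = O(|J|)$ we can choose $K$ to
be a dilation of $J$. We then have

$\frac{w(I_P)}{w(J)} = \frac{w(I_P)}{w(K)} \frac{w(K)}{w(J)} \le C\Big(\frac{|I_P|}{|K|}\Big)^{\beta_0} \Big(\frac{|K|}{|J|}\Big)^{\gamma} \le C\Big(\frac{|I_P|}{|J|}\Big)^{\beta_0} \Big(1 + \frac{\mathrm{dist}(J, I_P)}{|I_P|}\Big)^{\gamma}$ .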

Therefore by choosing $N$ sufficiently large it follows from (4.7) that

l/r' 



\gC Tj f\ r w 

<C7size(T) density (T) £ (JM)*/r w( J)^' (1 + ^Wp) } _ 



= Csize(T) dcnsity(T)u)(J) 1/r ' E E 2-^(1 + dist ( J >^) )-3iV ^ 

fc>-i |/ P |=2- fc |j| ' p ' 

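Using the fact that $3J$ does not contain any $I_P$, $P \in T_J$, and the fact that elements
of $T_J$ of the same size are spatially disjoint, it is not hard to bound the last display
by

$\le C\, \mathrm{size}(T)\, \mathrm{density}(T)\, w(J)^{1/r'} \Big(1 + \frac{\mathrm{dist}(J, I_T)}{|I_T|}\Big)^{-2N}$ .

Thus, we can bound $A$ by

$A \le C\, \mathrm{size}(T)\, \mathrm{density}(T) \Big(\sum_{J \in \mathbf{J}} w(J) \Big(1 + \frac{\mathrm{dist}(J, I_T)}{|I_T|}\Big)^{-2Nr'}\Big)^{1/r'}$ .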

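Note that by definition $3J$ does not contain $I_T$. It follows that for any $x \in J$

$1 + \frac{\mathrm{dist}(J, I_T)}{|I_T|} \sim 1 + \frac{|x - c(I_T)|}{|I_T|}$ .

Choosing $N$ large and using disjointness of the $J$'s, we obtain

$\sum_{J \in \mathbf{J}} w(J) \Big(1 + \frac{\mathrm{dist}(J, I_T)}{|I_T|}\Big)^{-2Nr'} \le C \int \chi_{I_T}^{2Nr'} w \le Cw(I_T)$ .

Consequently, we have

$A \le Cw(I_T)^{1/r'}\, \mathrm{size}(T)\, \mathrm{density}(T)$ .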
To bound $B$, let $F_J = \bigcup_{P \in T \setminus T_J} \omega_{P_2}$; we first show that

(4.8)  $\int_J |g|^{r'} \sum_{j : N_j \in F_J} |d_j|^{r'} w \le Cw(J) [\mathrm{density}(T)]^{r'}$ .
Proof of (4.8). We construct $O(1)$ non-empty subtrees of $T$ such that $F_J$ is contained
inside the union of the frequency intervals of these trees. The top interval
of each such subtree will be of length $\sim |J|$ and will be contained in some $O(1)$
dilation of $J$. Clearly, (4.8) follows as a consequence of this construction.

To construct these trees, first we construct their (common) top interval $J_0$. Let
$\pi(J)$ be the dyadic parent of $J$. Then we can find $Q \in T$ such that $I_Q \subset 3\pi(J)$,
therefore we can select a dyadic interval $J_0$ such that

$I_Q \subset J_0 \subset 3\pi(J)$ ,  $|J_0| \ge |J|$ .

Now, note that by dividing $T$ into three trees if necessary, we may assume without
loss of generality that only one of the following scenarios happens:

(i) $\xi_T \in \omega_{P_2}$ for every $P \in T$, or

(ii) $\xi_T < \inf \omega_{P_2}$ for every $P \in T$, or

(iii) $\xi_T > \sup \omega_{P_2}$ for every $P \in T$.

In each of these scenarios, one tree will be constructed. The desired tree has only
one element $Q$ and has top data $(J_0, \omega_0)$, and $\omega_0$ is constructed below: it will be
shown that

(4.9)  $F_J \subset \omega_0 \subset \omega_Q$ .

We note that by choosing $C_4$ large in the definition (4.4) we can ensure that for
any $P \in T \setminus T_J$ we have $|\omega_P| < 1/|J_0|$. Furthermore, if $C_2 > 1$ we can also ensure
that |w F | < %^|J |. 

If (i) is satisfied, we let $\omega_0$ be the dyadic interval of length $1/|J_0|$ containing $\xi_T$.
It is clear that for any $P \in T \setminus T_J$ we have $\omega_{P_2} \subset \omega_0$ and $\omega_0 \subset \omega_{Q_2}$, and (4.9)
follows immediately.

If (ii) is satisfied, we let $\omega_0 = [\xi_T, \xi_T + \frac{1}{|J_0|})$. Since for any $P \in T \setminus T_J$ we have
$|\omega_P| < |\omega_0| \le |\omega_{Q_2}|$, it follows that we always have $\omega_{P_2} \subset \omega_0 \subset \omega_Q$, as desired.

If (iii) is satisfied, we let $\omega_0 = [\xi_T - \frac{1}{|J_0|}, \xi_T)$, and argue as in situation (ii).

This completes the proof of (4.8). □

Below we return to our task of estimating $B$. We remark that any $J \in \mathbf{J}$ that
contributes to $B$ must satisfy $|J| \le |I_T|/C_4 < |I_T|/4$, therefore $J \subset 3I_T$. We now
consider two cases:

Case 1: T is 2-lacunary: By ensuring that the constant Kq in the separation as- 
sumption (S3) is sufficiently large, it follows that for P,P' £ T with \Ip\ > \Ip>\ we 
have ujp 2 Cup. Using the fact that {Nj(x)} is an increasing sequence for every 
$x$, it follows from a geometrical consideration that for each $x$ there is at most one
$m$ such that $d_P(x) \ne 0$ for some $P \in T$ with $|I_P| = 2^m$. Here it is important
that the limiting condition reads $\{N_{j-1} \notin \omega_P, N_j \in \omega_{P_2}\}$. Now, uniformly over $m$
we have

E (i+^^)- 2 = o ( i) . 

PeT:\I P \=2 m ' P ' 

It then follows from (4.8) that 

\\1j9Ct\tJ\\ l ,' (w) < Csnp ^^ (f W' sup \d/ w)^' 

P€ T \lp\ L, ' i Jj k:N k £u, To 

< Csize(r)dcnsity(T)u;(J) 1/r ' 
consequently we obtain the desired estimate:

$B \le C\Big(\sum_{J \in \mathbf{J} :\, J \subset 3I_T} w(J)\Big)^{1/r'}\, \mathrm{size}(T)\, \mathrm{density}(T)$
$\le Cw(I_T)^{1/r'}\, \mathrm{size}(T)\, \mathrm{density}(T)$ .

Case 2: $T$ is 2-overlapping: We estimate pointwise

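$|C_{T \setminus T_J} f(x)| \le \Big(\sum_j \Big|\sum_{P \in T \setminus T_J :\, N_{j-1} \notin \omega_P,\, N_j \in \omega_{P_2}} \langle f, \phi_{P_1}\rangle \phi_{P_1}(x)\Big|^r\Big)^{1/r} \Big(\sum_{j : N_j \in F_J} |d_j(x)|^{r'}\Big)^{1/r'}$ .

Therefore

(4.10)  $\|1_J\, g C_{T \setminus T_J} f\|_{L^{r'}(w)} \le Cw(J)^{1/r'}\, \mathrm{density}(T) \sup_{x \in J} \Big(\sum_j \Big|\sum_{P \in T \setminus T_J :\, N_{j-1} \notin \omega_P,\, N_j \in \omega_{P_2}} \langle f, \phi_{P_1}\rangle \phi_{P_1}(x)\Big|^r\Big)^{1/r}$ .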



Note that for any $P$ the frequency support of $\phi_{P_1}$ is contained inside $C_3\omega_{P_1} =
(1 - c)C_2\omega_{P_1}$ for $c = 1 - \frac{C_3}{C_2} \in (0,1)$ which is uniform over $P$'s. Recall that $T$ is
a 2-overlapping tree and the relative positions of the tiles in each bitile are uniform
over $P$.

Now, by choosing the constant K in the separation assumption (S3) to be suffi- 
ciently large, we can find a lacunary family of smooth Littlewood-Paley projection 
operators II „ such that: II n is a smooth Fourier multiplier operator whose symbol 
is supported in {|£| = 0(2")}, and furthermore (thanks to separation) Iinllfc = life 
for any n < k and (f>p 1 = (IT,, — H n -i)(j)p 1 for n = log 2 \Ip\- 

It follows that for any $x \in J$ we can bound

Ei E (f,M)M\ r ) 1/r 

k PeT\T J :N k _ 1 <tujp,N k eujp 2 
K 

< sup Ei^-vJrif' 

K,no<-<nK<0(l)-log 2 \J\ J=1 

where $g_T := \sum_{P \in T} \langle f, \phi_{P_1}\rangle \phi_{P_1}$. The last display can be rewritten as

K 

sup (Ein log2 ui(n„ 3 -n n ._ 1 ) 5T | r ) 1/r 

K,n <— <nK<0(l)-log 2 |,/| J=1 
if 

<a/j( sup (Ein„ J 5T-n„ j _ l5T r) 1 /'-) 

K", n <"'<hk j =1 

using Minkowski's inequality and standard arguments. Here, $M_J$ denotes the following
local maximal operator:



Mjf = SUP Ir f |/| 
For simplicity we denote by $\|g_T\|_{V^r}$ the variational expression inside $M_J$ in the
above estimate. Recall that all the $J$ such that $T \setminus T_J \ne \emptyset$ are disjoint and contained
in $3I_T$. Thus, it follows from (4.10) and the above estimate that

$B \le C\, \mathrm{density}(T) \Big(\sum_{J \in \mathbf{J}} w(J)\, M_J(\|g_T\|_{V^r})^{r'}\Big)^{1/r'}$
$\le C\, \mathrm{density}(T)\, \|1_{3I_T} M(\|g_T\|_{V^r})\|_{L^{r'}(w)}$
$\le C\, \mathrm{density}(T)\, w(I_T)^{1/r' - 1/(2q)}\, \|M(\|g_T\|_{V^r})\|_{L^{2q}(w)}$

since $r' < 2 < 2q$. Using $w \in A_q \subset A_{2q}$ and Lemma 5.2 we obtain

$B \le C\, \mathrm{density}(T)\, w(I_T)^{1/r' - 1/(2q)}\, \|g_T\|_{L^{2q}(w)}$ .
To show the desired bound for $B$ it remains to show that

$\|g_T\|_{L^{2q}(w)} \le Cw(I_T)^{1/(2q)}\, \mathrm{size}(T)$ .

Take $h$ to be any function in $L^{(2q)'}(w)$ where $(2q)'$ denotes the dual exponent of $2q$.

Let $\sigma = w^{-(2q)'/(2q)}$; since $w \in A_q \subset A_{2q}$ it is clear that $\sigma \in A_{(2q)'}$. We have



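$\langle g_T, wh\rangle = \sum_{P \in T} \langle f, \phi_{P_1}\rangle \langle hw, \phi_{P_1}\rangle$

$\le \int \Big(\sum_{P \in T} |\langle f, \phi_{P_1}\rangle|^2 \frac{1_{I_P}}{|I_P|}\Big)^{1/2} \Big(\sum_{P \in T} |\langle hw, \phi_{P_1}\rangle|^2 \frac{1_{I_P}}{|I_P|}\Big)^{1/2} dx$

$\le \|S_T f\|_{L^{2q}(w)} \|S_T(hw)\|_{L^{(2q)'}(\sigma)}$ .

Then using the John-Nirenberg characterization of size in Lemma 3.5 and the estimate
(3.4), it is not hard to see that

$\langle g_T, wh\rangle \le Cw(I_T)^{1/(2q)}\, \mathrm{size}(T)\, \|hw\|_{L^{(2q)'}(\sigma)}$

$= Cw(I_T)^{1/(2q)}\, \mathrm{size}(T)\, \|h\|_{L^{(2q)'}(w)}$ ,

as desired.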
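The two weight identities used in this duality step can be checked directly. The following is a short verification written for this exposition, with the shorthand $p := 2q$ and the dual weight $\sigma = w^{-p'/p} = w^{1-p'}$ as the only assumptions.

```latex
% Shorthand for this sketch: p := 2q, p' its dual exponent, \sigma := w^{1-p'}.
% (1) Hoelder with the weight split w^{1/p} \cdot w^{-1/p}:
\[
\int A\,B\,dx
 = \int \bigl(A\,w^{1/p}\bigr)\bigl(B\,w^{-1/p}\bigr)\,dx
 \le \|A\|_{L^{p}(w)}\,\Bigl(\int |B|^{p'} w^{-p'/p}\,dx\Bigr)^{1/p'}
 = \|A\|_{L^{p}(w)}\,\|B\|_{L^{p'}(\sigma)} ,
\]
% since -p'/p = 1 - p'.
% (2) The norm of hw in L^{p'}(\sigma) equals the norm of h in L^{p'}(w):
\[
\|hw\|_{L^{p'}(\sigma)}^{p'}
 = \int |h|^{p'}\,w^{p'}\,w^{1-p'}\,dx
 = \int |h|^{p'}\,w\,dx
 = \|h\|_{L^{p'}(w)}^{p'} .
\]
```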

Proof of (4.2): Let $\tilde g = g 1_{2^{k+1} I_T \setminus 2^k I_T}$. Note that it suffices to consider $k \ge 2$. One
proceeds as in the above proof of (4.1) with $\tilde g$ in place of $g$. It suffices to observe
that in the above proof of (4.1) we don't need to consider (4.6) for $k \ge 2$ since all
the $J$ that contribute to this term are contained inside $3I_T$. Furthermore, any $J$
that contributes to (4.5) satisfies 

therefore in the rest of the proof one could easily introduce a decaying factor. □ 

Lemma 4.2. Let $T$ be a tree and suppose that any two bitiles of $T$ are disjoint.
Then there exists some $C = C(w) < \infty$ such that

(4.11)  $\|g C_T f\|_{L^1(w)} \le Cw(I_T)\, \mathrm{size}(T)\, \widetilde{\mathrm{density}}(T)$ .

Proof. Clearly the elements of $T$ must be spatially disjoint using the separation
assumption on $\mathbf{P}$ and the fact that $T$ is a tree. Thus, by the triangle inequality
it suffices to show (4.11) for any single-element tree, but the improved $L^1$ tree
estimate is clear for these trees. □
5. Weighted variational inequalities for Littlewood-Paley families

In this section, we prove weighted extensions of a variational inequality for
Littlewood-Paley families [1,10,14,20]. Note that the dyadic variant of Lemma 5.2
below was proved in [3].

Definition 5.1. Fix absolute constants $C \in (1,\infty)$, $\{C_N : N \in \mathbb{N}\}$, and
$m > 1$. A sequence of functions $(f_j)_{j \in \mathbb{Z}}$ is a Littlewood-Paley family if each $f_j$
has frequency support inside $\{C^{-1} 2^{-j} < |\xi| < C2^{-j}\}$, and

|^//WI<C Ar 2-^[l + |x|2^]- m 

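Lemma 5.2. Let $1 < p < \infty$, $w \in A_p$ and $r \ne 2$. Let $s = \min(r, 2)$. Then for any
Littlewood-Paley family $(f_j)$ we have

(5.1)  $\Big\| \sup_{K,\, N_0 < \cdots < N_K} \Big(\sum_{k=1}^{K} \Big|\sum_{N_{k-1} < j \le N_k} f_j\Big|^r\Big)^{1/r} \Big\|_{L^p(w)} \le C \Big\| \Big(\sum_j |f_j|^s\Big)^{1/s} \Big\|_{L^p(w)}$ .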
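For a finite sequence, the supremum over increasing index sequences that appears on the left of (5.1) can be computed exactly by dynamic programming. The sketch below is only an illustration of that quantity at a single point: the list `a` stands for the partial sums $a_N = \sum_{j \le N} f_j(x)$, so that $a_{N_k} - a_{N_{k-1}} = \sum_{N_{k-1} < j \le N_k} f_j(x)$, and the function name `r_variation` is hypothetical.

```python
from typing import Sequence

def r_variation(a: Sequence[float], r: float) -> float:
    """Sup over increasing index sequences n_0 < ... < n_K of
    (sum_k |a[n_k] - a[n_{k-1}]|**r) ** (1/r), for a finite sequence a.

    best[j] holds the largest sum of r-th powers of jumps over all
    increasing index chains that end at index j (0 for the trivial chain).
    """
    n = len(a)
    best = [0.0] * n
    for j in range(n):
        for i in range(j):
            best[j] = max(best[j], best[i] + abs(a[j] - a[i]) ** r)
    return max(best, default=0.0) ** (1.0 / r)
```

For example, for the oscillating sequence `[0, 1, 0, 1]` the value with `r = 1` is the total variation 3, while for a monotone sequence and `r > 1` the best choice is a single jump from the first to the last entry. The double loop costs $O(n^2)$ operations, which is adequate for a sketch.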

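Proof. Let $\Delta_j$ be the Littlewood-Paley projection of $f$ into an enlarged frequency range
$\{(2C)^{-1} 2^{-j} < |\xi| < 2C2^{-j}\}$, such that $\Delta_j f_j = f_j$. It then suffices to show that for
any $w \in A_p$ and any family of Littlewood-Paley projections $(\Delta_j)$ and any vector
valued function $\mathbf{f} = (f_j)_{j \in \mathbb{Z}}$ we have

(5.2)  $\Big\| \sup_{K,\, N_0 < \cdots < N_K} \Big(\sum_{k=1}^{K} \Big|\sum_{N_{k-1} < j \le N_k} \Delta_j f_j\Big|^r\Big)^{1/r} \Big\|_{L^p(w)} \le C \Big\| \Big(\sum_j |f_j|^s\Big)^{1/s} \Big\|_{L^p(w)}$ .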

Let $T\mathbf{f}$ denote the variational operator inside $\|\cdot\|$ in the left hand side of (5.2).
Then it suffices to show the following pointwise bound for the dyadic sharp maximal
function of $T\mathbf{f}$: for any $1 < t < \infty$,

(5.3)  $(T\mathbf{f})^\sharp(x) \le C M_t(|\mathbf{f}|)(x)$ ,  $|\mathbf{f}| = \Big(\sum_j |f_j|^s\Big)^{1/s}$ .

Indeed, since $w \in A_p$ this will imply that

$\|T\mathbf{f}\|_{L^p(w)} \le C\|(T\mathbf{f})^\sharp\|_{L^p(w)} \le C\|M_t(|\mathbf{f}|)\|_{L^p(w)}$ .

We now take $1 < t < p$ sufficiently small such that $w \in A_{p/t}$, and the desired
estimate (5.2) then follows:

$\|T\mathbf{f}\|_{L^p(w)} \le C\|M_t(|\mathbf{f}|)\|_{L^p(w)} \le C\||\mathbf{f}|\|_{L^p(w)}$ .

It remains to show (5.3), and we use an argument from [4]. Take any dyadic 
interval I containing x. Let Cj be a constant defined as follows: 

[ 0, otherwise 
where $\phi_j$ is the corresponding convolution function of $\Delta_j$. Then let

c I= sup (£| E c iD 1/r 

K,N <-<N K k Nh _ 1<j:£Nk 



then it is not hard to see that 

sup (£\ E A ^n 1/r 

K,N <-<N K k Nk _ lKj < Nk 
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< sup (£\ E (^7i-c,-)l r ) 1/r 

K,N„<-<N K k Nk _ 1<3 < Nk 

We then decompose 

$\Delta_j f_j - c_j = g_j + b_j$

where 

((O.A^-c,) 2^ 
[(Aj(/jl3/), Aj-(/,-l( 3 /)o)), otherwise 
It is not hard to see that for any $y \in I$ we have

|6 3 -(tf)| < C^i/jC^min [(^|/|) 6 ,(2*|I|)- 

The parameter $\epsilon > 0$ here depends on the decay of $\phi_j$ and its derivative. Now, by
Hölder's inequality and the known Lebesgue case$^4$ of (5.2), we have

w\l sup (El E A ^n 1/r 

\I\ JlK,N <-<N K ^ Nk _f^< Nk 

<ii ^p c£\ E A ^r) 1/r n t 
* c ^ikE fei s ) 1/s ii* < ]7p*ii(E i/;W) 1/s ii* < ■ 

On the other hand, 



mf su ? (El E N r ) 1/r <^r [j2\bM\dy 

\I\ JlK,N <-<N K V N h -tZj<N h ^l J/ V 

<C^ m in[(2^|/|)%(2^|/|)- e ]Mi/ J -(a:) 



< CsupMi/ 3 -(x) < CA<i(f)(i) < CM t (f)(x) 



□ 



6. The main argument and proof of Proposition 2.1 

Without loss of generality assume that $w(F) > 0$ and $w(G) > 0$ and

$\max(w(F), w(G)) = 1$ .

Recall that our aim is to find major subsets of $F$ and $G$ respectively such that at
least one of them has full measure, and if $|f|$ and $|g|$ are supported inside these sets
and bounded above by 1 then

(6.1)  $B_{\mathbf{P}}(f,g) \le Cw(F)^{1/p} w(G)^{1 - 1/p}$

for all $p \in (q, \infty)$ such that $1/r < 1/q - 1/p$. The major subsets will be chosen
using the weighted maximal function, see its definition in Section 1.1.

Case 1: $w(F) \le w(G)$.

We choose $\tilde F = F$ and $\tilde G = G \setminus \Omega$ with

$\Omega := \{M_{1,w} 1_F > Cw(F)\}$



4 Note that in the Lebesgue case, (5.2) is equivalent to (5.1) thanks to boundedness of the 
vector valued maximal function, this was observed in [4]. 
and $C < \infty$ is sufficiently large such that $w(\Omega) < 1/2$.

Fix $q_0 \in (q, \infty)$ very close to $q$. We use the following estimate whose (rather
standard) proof is included later:

Lemma 6.1. For any $\eta \in (\frac{2q_0}{r}, 1)$ there is a positive constant $\epsilon = \epsilon(\eta, q_0, r) > 0$
such that

(6.2)  $B_{\mathbf{P}}(f,g) \le C\, \mathrm{size}(\mathbf{P})^{1-\eta}\, \mathrm{density}(\mathbf{P})^{\epsilon}\, w(F)^{\eta/(2q_0)}$ .

Furthermore, if the elements of $\mathbf{P}$ are disjoint in the phase plane then a stronger
variant of (6.2) holds where $\widetilde{\mathrm{density}}(\mathbf{P})$ is used in place of $\mathrm{density}(\mathbf{P})$.

Below we show how Lemma 6.1 implies the desired estimate (6.1) using an argument
from [17, 18]. We decompose the original $\mathbf{P} = \bigcup_{k \ge 0} \mathbf{P}^{[k]}$ where

$\mathbf{P}^{[k]} = \Big\{P \in \mathbf{P} : 2^k \le 1 + \frac{\mathrm{dist}(I_P, \Omega^c)}{|I_P|} < 2^{k+1}\Big\}$ .

Observe that if $P \in \mathbf{P}^{[k]}$ then $2^{k+2} I_P \cap \Omega^c \ne \emptyset$. Therefore, using Lemma 3.4 we
obtain

(6.3)  $\mathrm{size}(\mathbf{P}^{[k]}) \le C2^{O(k)} w(F)^{1/q}$ .

On the other hand, it is not hard to see that

$\widetilde{\mathrm{density}}(\mathbf{P}^{[k]}) \le C2^{-Dk/2}$ .

Now, observe that if $k \ge 1$ then the collection $\mathbf{P}^{[k]}$ can be decomposed into $O(1)$
bitile subcollections, such that for any two $P \ne P'$ in each subcollection we have
$(I_P \times \omega_P) \cap (I_{P'} \times \omega_{P'}) = \emptyset$. To see this, note that for $k \ge 1$ the length of any nested
sequence in $\{I_P : P \in \mathbf{P}^{[k]}\}$ must be $O(1)$. It then follows that we can decompose
$\mathbf{P}^{[k]}$ into $O(1)$ subcollections, in each collection the spatial intervals $I_P$ of two bitiles
are either the same or disjoint, and via another decomposition (to ensure that any
two different bitiles sharing the same spatial interval are far from each other in
frequency) we can obtain $O(1)$ subcollections with the desired properties.

Thus, for the purpose of proving (6.1) we may assume without loss of generality
that for $k \ge k_0$ the elements of $\mathbf{P}^{[k]}$ are disjoint in the phase plane. For those $k$ we
have

B PW (f,g) < (7size(P [fcl ) 1 -^[de^ity(p[ fc ])] e u;(F)''/( 2,J0 ) 
< C2- Df - k / 2 size(p[ fc ]) 1 -"w( J F 1 ) r '/( 2 «°) (since supp(g) C 



< C2~ Dek/2 



i-n 



l° {k) w(F) 1/q w{Fy l,{2qo) . 

Choosing $D$ large in the definition of density (certainly $D$ depends on $q, q_0, r, w$)
we obtain

$B_{\mathbf{P}^{[k]}}(f,g) \le C2^{-\epsilon k} w(F)^{(1-\eta)/q + \eta/(2q_0)}$ ,  $k \ge k_0$ .

On the other hand for $0 \le k < k_0$ disjointness may not be available, and we
only have $\mathrm{density}(\mathbf{P}^{[k]}) = O(1)$, but since $k_0 = O(1)$ we also have $\mathrm{size}(\mathbf{P}^{[k]}) =
O(w(F)^{1/q})$ from (6.3). Using a similar argument as before, we obtain

$B_{\mathbf{P}^{[k]}}(f,g) \le Cw(F)^{(1-\eta)/q + \eta/(2q_0)}$ ,  $k < k_0$ .

Thus, summing the above estimates over $k \ge 0$ we obtain

$B_{\mathbf{P}}(f,g) \le Cw(F)^{(1-\eta)/q + \eta/(2q_0)}$ .
For any $p$ such that

$\frac{1}{p} < \frac{1}{q} - \frac{1}{r}$

we can choose $q_0$ sufficiently close to $q$ and $\eta$ sufficiently close to $2q_0/r$ (keeping
$1 > \eta > 2q_0/r$ and $q_0 > q$) such that

$(1 - \eta)/q + \eta/(2q_0) > 1/p$ .

The desired estimate (6.1) now follows immediately, using $w(F) \le 1$:

$B_{\mathbf{P}}(f,g) \le Cw(F)^{1/p}$ .
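The choice of $q_0$ and $\eta$ can be checked by a short computation. This is a worked verification added for the reader, using only the quantities $q$, $q_0$, $r$, $\eta$, $p$ from the text.

```latex
% Rewrite the exponent of w(F):
\[
\frac{1-\eta}{q} + \frac{\eta}{2q_0}
  = \frac{1}{q} - \eta\Bigl(\frac{1}{q} - \frac{1}{2q_0}\Bigr).
\]
% At the endpoint \eta = 2q_0/r this equals
\[
\frac{1}{q} - \frac{2q_0}{rq} + \frac{1}{r}
  \;\longrightarrow\; \frac{1}{q} - \frac{1}{r}
  \qquad (q_0 \downarrow q).
\]
% The exponent is continuous in (q_0,\eta), so if 1/p < 1/q - 1/r it stays
% above 1/p for q_0 slightly larger than q and \eta slightly larger than 2q_0/r.
% Since w(F) \le 1, a larger exponent gives a smaller power of w(F), hence
% w(F)^{(1-\eta)/q + \eta/(2q_0)} \le w(F)^{1/p}.
```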

Proof of Lemma 6.1:

We show only the general case when $\mathbf{P}$ is arbitrary. The proof of the disjoint
case when any two elements of $\mathbf{P}$ are disjoint is completely analogous. The main
difference is the use of the improved tree estimate (Lemma 4.2) in place of the
standard tree estimate (Lemma 4.1).

For convenience, we denote $S_1 = \mathrm{size}(\mathbf{P})$, $E_1 = w(F)^{1/(2q_0)}$ and $D_1 = \mathrm{density}(\mathbf{P})$.
Using Lemma 3.7 and Lemma 3.11 we can decompose $\mathbf{P} = \bigcup_{n \in \mathbb{Z}} \mathbf{P}_n$ where each
$\mathbf{P}_n$ is a union of trees inside a tree collection $\mathbf{T}_n$, such that

$\sum_{T \in \mathbf{T}_n} w(I_T) \le C2^n$ ,

$\mathrm{size}(\mathbf{P}_n) \le C2^{-n/(2q_0)} E_1$ ,  $\mathrm{density}(\mathbf{P}_n) \le 2^{-n/r'}$ .

It then follows from the tree estimate (4.3) (applied with $L^1$ norm) that

$B_{\mathbf{P}}(f,g) \le C \sum_{n \in \mathbb{Z}} \sum_{T \in \mathbf{T}_n} w(I_T)\, \mathrm{size}(T)\, \mathrm{density}(T)$

$\le C \sum_{n \in \mathbb{Z}} 2^n \min(S_1, 2^{-n/(2q_0)} E_1) \min(D_1, 2^{-n/r'})$ .

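It follows that for $\alpha, \beta \in [0,1]$ we have

$B_{\mathbf{P}}(f,g) \le CS_1 D_1 \sum_{n \in \mathbb{Z}} 2^n \min(1, 2^{-n/(2q_0)} E_1 S_1^{-1})^{\alpha} \min(1, 2^{-n/r'} D_1^{-1})^{\beta}$

$\le CS_1 D_1 \sum_{n \in \mathbb{Z}} 2^n \min\big(1, 2^{-n\kappa} (E_1/S_1)^{\alpha} D_1^{-\beta}\big)$ ,

where $\kappa := \alpha/(2q_0) + \beta/r'$. Under the assumption $r > 2q$ we can choose $q_0 > q$ such that $r > 2q_0$. Then we
can find $\alpha, \beta \in [0,1]$ such that

(6.4)  $\kappa = \frac{\alpha}{2q_0} + \frac{\beta}{r'} > 1$ .

We then obtain a two-sided geometric series which is bounded above by a constant multiple of its largest
term. Thus

$B_{\mathbf{P}}(f,g) \le CS_1 D_1 (E_1/S_1)^{\alpha/\kappa} D_1^{-\beta/\kappa} = CS_1^{1 - \alpha/\kappa} E_1^{\alpha/\kappa} D_1^{1 - \beta/\kappa}$ .

Let $\eta = \alpha/\kappa$; we have $\eta \in (\frac{2q_0}{r}, 1)$ and in fact varying $\alpha, \beta \in [0,1]$ respecting the
condition (6.4) we can obtain any value of $\eta$ in $(2q_0/r, 1)$. Furthermore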

, If, I l r ' , 2(7 , „ 
giving the desired estimate (6.2). This completes the proof of Lemma 6.1.
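The claim that the two-sided geometric series is controlled by its largest term can be checked numerically. In the sketch below, `X` stands for the quantity $(E_1/S_1)^{\alpha} D_1^{-\beta}$ and `kappa` for the exponent in (6.4); the function names and the explicit constant $2 + 1/(1 - 2^{1-\kappa})$ are choices made for this illustration, not taken from the source.

```python
def two_sided_sum(X: float, kappa: float, n_range: int = 200) -> float:
    """Truncation of sum over integers n of 2**n * min(1, 2**(-n*kappa) * X)."""
    return sum(2.0 ** n * min(1.0, 2.0 ** (-n * kappa) * X)
               for n in range(-n_range, n_range + 1))

def largest_term_bound(X: float, kappa: float) -> float:
    """Upper bound c(kappa) * X**(1/kappa), valid for kappa > 1.

    Terms with 2**(n*kappa) <= X equal 2**n and sum to at most 2 * X**(1/kappa);
    the remaining terms decay like 2**(-(kappa-1)*m) from at most X**(1/kappa).
    """
    return (2.0 + 1.0 / (1.0 - 2.0 ** (1.0 - kappa))) * X ** (1.0 / kappa)
```

The crossover between the two regimes occurs where $2^{n\kappa} = X$, that is, at $2^n = X^{1/\kappa}$, which is the source of the exponents $\alpha/\kappa$ and $\beta/\kappa$ in the resulting bound. The condition $\kappa > 1$ is exactly what makes the tail for large $n$ summable.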

Case 2: $w(F) > w(G)$. We will choose $\tilde G = G$ and $\tilde F = F \setminus \Omega$ where

$\Omega = \{M_{1,w} 1_G > Cw(G)\}$

where $C < \infty$ is sufficiently large such that $w(\Omega) < 1/2$. We will use the following
estimate, whose proof is included later:

Lemma 6.2. Suppose that $\mathrm{density}(\mathbf{P}) \le Mw(G)^{1/r'}$ for some $M \ge 1$. Then for
any $p < \infty$ there exists a constant $\delta = \delta(p, q, w, r) > 0$ such that

(6.5)  $B_{\mathbf{P}}(f,g) \le CM\, \mathrm{size}(\mathbf{P})^{\delta}\, w(G)^{1/p' - 1/r'}$ .

Below we show how Lemma 6.2 implies the desired estimate (6.1). Decompose
$\mathbf{P}$ into $\bigcup_{h \ge 0} \mathbf{P}^{[h]}$ where

$\mathbf{P}^{[h]} = \Big\{P \in \mathbf{P} : 2^h \le 1 + \frac{\mathrm{dist}(I_P, \Omega^c)}{|I_P|} < 2^{h+1}\Big\}$ .

We verify below that

(6.6)  $\mathrm{density}(\mathbf{P}^{[h]}) \le C2^{O(h)} \big[\sup_{x \in \Omega^c} (M_{1,w} 1_G)(x)\big]^{1/r'} \le C2^{O(h)} w(G)^{1/r'}$ ,

here the implicit constant in $O(h)$ depends on the doubling exponent $\gamma$ of $w$. Indeed,
let $T$ be any non-empty tree in $\mathbf{P}^{[h]}$. Then it is clear that

$1 + \frac{\mathrm{dist}(I_T, \Omega^c)}{|I_T|} < 2^{h+1}$ .

We then enlarge $I_T$ by a factor of $O(2^h)$ to obtain an interval $J$ such that $J \cap \Omega^c \ne \emptyset$;
clearly $w(J) \le C2^{O(h)} w(I_T)$ and therefore

$\Big(\frac{1}{w(I_T)} \int \chi_{I_T}^D |g|^{r'} w\Big)^{1/r'} \le C2^{O(h)} \big[\inf_{x \in J} M_{1,w} 1_G(x)\big]^{1/r'}$ ,

from which the estimate (6.6) follows immediately.

On the other hand, using $\mathrm{supp}(f) \subset \Omega^c$, it follows from Lemma 3.4 that

$\mathrm{size}(\mathbf{P}^{[h]}) \le C_N 2^{-Nh}$

for any $N > 0$. Taking $N$ very large in the above estimate, it follows from (6.5) and
(6.6) that

$B_{\mathbf{P}^{[h]}}(f,g) \le C2^{-h} w(G)^{1/r'} w(G)^{1/p' - 1/r'} = C2^{-h} w(G)^{1/p'}$ ,

and (6.1) now follows from summing these estimates over $h \ge 0$.

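Proof of Lemma 6.2: Fix $q_0 \in (q, \infty)$. Using Lemma 3.7 and Lemma 3.11, we can
decompose $\mathbf{P} = \bigcup_{n \in \mathbb{Z}} \mathbf{P}_n$ where $\mathbf{P}_n$ is the union of trees from a tree collection $\mathbf{T}_n$,
such that

$\sum_{T \in \mathbf{T}_n} w(I_T) \le C2^n$ ,

$\mathrm{size}(\mathbf{P}_n) \le C2^{-n/(2q_0)}$ ,  $\mathrm{density}(\mathbf{P}_n) \le C2^{-n/r'} w(G)^{1/r'}$ .

We use Lemma 3.7 again and decompose $\mathbf{P}_n = \bigcup_{m \ge 0} \mathbf{P}_{n,m}$ where $\mathbf{P}_{n,m}$ is the
union of trees from a tree collection $\mathbf{T}_{n,m}$ such that

$\mathrm{size}(\mathbf{P}_{n,m}) \le C2^{-(n+m)/(2q_0)}$ ,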
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TeT„, m TeT„ 
W \\lp(w) < C2° (k) 2 n+m w(F) 1/p = C2 0(k) 2 n+r ' 



In particular, it follows from the doubling property of w that 

ii E wiUM-)< c ' 27fe2 " ■ 

reT„ m 

By interpolation, it follows that for any $1 < p < \infty$ and any $\epsilon > 0$ we have

(6.7) || h*i T \\L,-*( w )<C2°W2 m ^2 n . 

TeT„ >m 

Here, the implicit constant in $O(k)$ may depend on $p$, $\epsilon$, $w$. For convenience, for any
k > let denote the counting function 

Decomposing $1 = 1_{I_T} + \sum_{k \ge 0} (1_{2^{k+1} I_T} - 1_{2^k I_T})$ for each $T$ and applying Hölder's
inequality, we obtain

$B_{\mathbf{P}_{n,m}}(f,g) \le C \sum_{k \ge -1} B_k(n,m)$ ,

S_i(n,m):= /(^L) 1/r ( £ |l/r3CT/r') 1/r '^a: , 

B fc (n,m):= f {N^ -N^) 1 '^ £ l^+^/^T/r') 1 ^ wdi , fc>0 . 

TeT„_ m 

Estimate for $B_{-1}(n,m)$: Fix $p < \infty$ very large and $\epsilon > 0$ very small, such
that in particular $p - \epsilon > r$. Applying Hölder's inequality we obtain

B-i(n,m)<C||(iVlo) n ) 1/r ||L,-M-)ll( E l 1 ^T/r') 1/r '|U- e)>) • 

reT„,,„ 

Using (6.7), the first factor can be rewritten and estimated by 



V li ll 1/r < C2 n/r 2 { r-^ 1 



TeT„, m 

using $p - \epsilon > r$. The second factor is supported inside $\mathrm{supp}(g) \subset G$, thus it can be
bounded above by



<C W (G)T^F-^\\( J2 \li T gC T f\ r ') 1/r '\\ Lr 



t<et„. 



TET„, m 

Using the tree estimate (4.1), we can bound the above expression by 

l/r' 



<Cw(G)T^P P [ u,(/ r )J r size(P„, m )dcnsity(P„, m ) 

< Ci^G)^)^ [2^] [2" (n+m)( ^- <5) size(P) 15 ] min (2- ,l / r ' w(G) 1/r ' , Mw(G) 1/r '^ . 
here $\delta > 0$ is very small, to be chosen later. Since $M \ge 1$, it follows that

Y S_i(m,n) 

m>0 

< CMw{G) ^=JF size(P) 5 ^ [2-2 ( -"p } 

m>0 

Since $r > 2q$ we can always choose $q_0 > q$ such that $r > 2q_0$, and then choose $\delta > 0$
depending on $q_0, r$ such that

$\frac{1}{r} - \Big(\frac{1}{2q_0} - \delta\Big) < 0$ ,

which implies $1/r - 1/p - 1/(2q_0) + \delta < 0$. Therefore the above summation over
$m \ge 0$ converges, and

Y Y B-i(m,n) < CMw(G)(^size(P) 5 ^2" ( -~^ +A ~Vin(l,2^) . 

n m>0 n£Z 

Since 

1 1 ^_ 1 1 1 

r 2g r 2q r' 

we can refine our previous choice of $\delta = \delta(q_0, r) > 0$ such that the above estimate
of $\sum_{m \ge 0} B_{-1}(m,n)$ remains a two-sided geometric series. It follows that

sizc(P) 

n rn>0 

Since we can choose $p < \infty$ arbitrarily large and since $w(G) \le 1$, it follows that

size(P) 

n m>0 

for any p < oo . 

Estimate for $\sum B_k(n,m)$: The argument is similar to the above estimate for the
sum of $B_{-1}(n,m)$, with the following difference: we will collect some power of $2^k$, and
we will gain the decay factor $2^{-Nk}$ from the tree estimate (4.2) where $N$ could be
chosen arbitrarily large. We obtain, via a similar argument and by choosing $N$
large enough, the following estimate

$\sum_n \sum_{m \ge 0} B_k(m,n) \le C2^{-k} M w(G)^{1/p'}\, \mathrm{size}(\mathbf{P})^{\delta}$

for any $p < \infty$.

Summing over $k \ge -1$, we obtain the desired estimate (6.5). This completes the
proof of Lemma 6.2.
proof of Lemma 6.2. 